What are those diagrams?
They show dependencies between the internal modules of various well-known Python libraries.
Their goal is to provide a global overview of a Python project's architecture, as a map of its modules & packages, the top-level code abstractions.
Note that all module names in those diagrams are HTML links to the actual source code on GitHub.
At work, we did a short technical-debt review of one of our Python services, and a co-worker reported a lack of documentation to provide a clear overview of the code structure, for first-time contributors to easily jump in.
Hence, last week I searched for some helpful code visualization recipes to provide such insight into our code base, hoping to find an easy-to-set-up Python module that would do the job.
I did not find any off-the-shelf package for my need (although I'd love your suggestions if you know some!), but I discovered Francois Zaninotto's DependencyWheel visualization of dependencies, and decided to use it to build a nice diagram and add it to our documentation.
I thought it could be useful to others, hence this blog post to share the recipe online.
Following the spirit of "Modern Technical Writing" / "Literate programming" / "Living Documentation", our documentation for this project at work is written in Markdown and compiled with mkdocs to provide a static website. Moreover, the project is built & hosted by GitLab Pages.
This way, the diagram is always up-to-date with the project code. It also made the addition of this diagram quite easy:
- I added some code to the GitLab Pages build script to fetch the corresponding git repo and extract the module dependencies as JSON.
The script to extract the module dependencies is on GitHub: gen_modules_graph.py. It is less than 100 lines long and uses the modulegraph package to parse module dependencies, taking care to:
- ignore modules outside of the target project
- ignore constants, functions and modules with zero incoming & outgoing dependencies (like Python packages with an empty
gen_modules_graph.py ansible.inventory.manager ansible.playbook ansible.executor.task_queue_manager > modules-ansible.json
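To illustrate the "ignore modules outside of the target project" filter, here is a minimal sketch. It is not the code of gen_modules_graph.py and it does not use modulegraph: it swaps in the standard-library `ast` module to list the imports of a single source string. The function name `internal_imports` and the `project` parameter are hypothetical names chosen for this example.

```python
import ast


def internal_imports(source, project):
    """Return the set of imported module names that belong to `project`.

    Hypothetical helper: only absolute imports are considered, and an
    import is kept when its name is `project` or starts with `project.`.
    """
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            if name == project or name.startswith(project + "."):
                found.add(name)
    return found


example = """
import os
import ansible.playbook
from ansible.executor import task_queue_manager
from json import dumps
"""
print(sorted(internal_imports(example, "ansible")))
# prints ['ansible.executor', 'ansible.playbook']
```

A real dependency graph also needs to resolve relative imports and walk every file of the project, which is the kind of work a package like modulegraph takes care of.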
For the rendering, I used fzaninotto/DependencyWheel, originally written to display the external dependencies of a project (e.g. links between PHP composer packages). I made 2 small patches / PRs to the latest version of this project:
- a single-line code change to allow for color customization
- another minor change to make the chart adaptive to the parent DOM element width
I also used some additional JS code to:
- ensure the dependencies matrix is square (to get prettier graphs)
- customize the colors (cf. below)
- add HTML anchor links
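The "square matrix" step can be sketched as follows. The page itself does this in JS; the sketch is in Python only to stay in the same language as the extraction script. The input shape (a dict mapping each module to the list of modules it depends on) and the name `square_matrix` are assumptions made for this example, not the actual JSON layout produced by gen_modules_graph.py.

```python
def square_matrix(deps):
    """Build a square dependency matrix from a {module: [dependencies]} dict.

    Every module that appears either as a source or as a target gets one
    row and one column, so the matrix is always N x N.
    """
    names = sorted(set(deps) | {dst for targets in deps.values() for dst in targets})
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for src, targets in deps.items():
        for dst in targets:
            matrix[index[src]][index[dst]] = 1  # row depends on column
    return names, matrix


names, matrix = square_matrix({
    "pkg.a": ["pkg.b", "pkg.c"],  # pkg.c never appears as a key
    "pkg.b": ["pkg.c"],
})
for name, row in zip(names, matrix):
    print(name, row)
```

Modules that are only ever imported, like `pkg.c` here, still get their own all-zero row, which is what keeps the matrix square.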
The code is available in this page's source. Like the Python script, you are free to reuse it at will.
It is relatively straightforward, with a single notable trick: the conversion from a Python module path to a hue value on a 360-degree scale.
A little bit of maths
In order for modules with a shared ancestor to have close colors (like http.response.text in the scrapy wheel above), I used a simple mathematical concept: decomposing the hue value with a bijective numeration into a fixed-size string of digits.
This idea is similar to the binary numeral system, notably with the same concept of most / least significant digits, except that the final range covered is [0, 360] and we want as many digits as the module tree depth.
Once the radix of this numeral system is computed from those 2 constraints, computing the hue value is simply a matter of basic exponentiation:
Let's consider a module tree of depth $D$. Then the base radix to use in our decomposition is

$$R = 360^{1/D}$$

Now, let $m$ be a module path, made of $d$ module names $m_i$, with $d \le D$. We can define $\mathrm{pos}(m_i)$ to be the position of the module name $m_i$ in the sorted list of its parent module's children, and $\mathrm{parentModCount}(m_i)$ to be the number of child modules of its parent. We can now compute the digits of $m$ in our decomposition:

$$d_{m_i} = \frac{\mathrm{pos}(m_i)}{\mathrm{parentModCount}(m_i)} \cdot (R - 1)$$

And then

$$\mathrm{hue}(m) = \sum_{i=1}^{D} d_{m_i} \cdot R^{D-i}$$
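The formulas above can be sketched in a few lines of Python. The page itself computes the hue in JS, so this is an illustration, not the code it uses. Three assumptions are mine: positions are counted from 0, a path shorter than the tree depth contributes zero for its missing digits, and the module tree is given as nested dicts (the `tree` argument and the toy module names are hypothetical).

```python
def hue(module_path, tree, depth):
    """Map a dotted module path to a hue in [0, 360).

    `tree` is a nested dict {module name: {child name: {...}}} and
    `depth` is the depth D of the whole module tree.
    """
    radix = 360 ** (1 / depth)  # R = 360^(1/D)
    total = 0.0
    children = tree
    for i, name in enumerate(module_path.split("."), start=1):
        siblings = sorted(children)
        # digit = pos / parentModCount * (R - 1), with a 0-based position
        digit = siblings.index(name) / len(siblings) * (radix - 1)
        total += digit * radix ** (depth - i)
        children = children[name]
    return total


# Toy module tree of depth 3, loosely inspired by the scrapy example
tree = {
    "core": {},
    "http": {"request": {}, "response": {"html": {}, "text": {}}},
    "utils": {},
}
for path in ("http.response.html", "http.response.text", "utils"):
    print(path, round(hue(path, tree, depth=3), 1))
```

Since the first module name is the most significant digit, two modules under the same parent differ only in their last digit and end up with close hues, while top-level packages are spread across the color wheel.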