thebjorn/pydeps

Group all submodules into its module

Opened this issue · 12 comments

Background:
I have too many submodules, which makes it difficult to get a quick overview of the diagram.

I would like to be able to fold/group submodules so that I don't get overloaded with information. I need a module and all of its children to be treated as the module itself. This would give me a coarser view instead of an overwhelmingly fine-grained one.

For instance, in the following diagram:

image

I would like app.routers, app.routers.tama_job, app.routers.msn, app.routers.program and app.routers.upload_data to all become a single app.routers node, with the arrows changed accordingly: if at least one of the submodules had an arrow to another module MODULE or its submodules, app.routers would also have an arrow to MODULE.
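The folding rule described here can be sketched in a few lines of Python. This only illustrates the requested behaviour and is not pydeps code: `fold_edges` is a hypothetical helper, and the sample edges are made up for the example.

```python
def fold_edges(edges, parent):
    """Fold `parent` and all of its submodules into the single node `parent`.

    `edges` is an iterable of (importer, imported) module-name pairs.
    """
    def fold(name):
        # "app.routers.msn" -> "app.routers"; unrelated names are unchanged.
        if name == parent or name.startswith(parent + "."):
            return parent
        return name

    folded = {(fold(src), fold(dst)) for src, dst in edges}
    # Edges between two folded submodules become self-loops; drop them.
    return {(src, dst) for src, dst in folded if src != dst}


# Hypothetical sample edges, not taken from the real diagram.
edges = [
    ("app.routers.msn", "app.services"),
    ("app.routers.program", "app.services"),
    ("app.routers", "app.routers.upload_data"),
    ("app.main", "app.routers.tama_job"),
]
folded = fold_edges(edges, "app.routers")
```

Calling the helper once per parent module would fold several packages in turn.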

I expected this to happen because I put the following in the .pydeps file:

[pydeps]
max_bacon = 2
no_show = True
show_dot = True
verbose = 0
pylib = False
reverse = True
exclude =
    app.tests
only = 
    app
    app.main
    app.models
    app.repos
    app.routers
    app.schemas
    app.services
    app.storage

I thought that it would not show submodules of, say, app.routers, because otherwise I would have listed them one by one. It would be nice if the graph showed the same granularity, fine or coarse, as described in the only field, or if it could be controlled somehow.
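One way to read this wish is that every module is drawn as the longest entry of `only` that matches it or one of its parent packages. A minimal sketch of that mapping follows. It assumes nothing about how pydeps actually interprets `only`; `coarsen` is a hypothetical name, and `app.utils.text` and `requests` are illustrative module names.

```python
def coarsen(name, listed):
    """Return the longest entry of `listed` that equals `name` or is one of
    its parent packages; return None if no entry matches."""
    parts = name.split(".")
    # Try the full name first, then progressively shorter parents.
    for end in range(len(parts), 0, -1):
        candidate = ".".join(parts[:end])
        if candidate in listed:
            return candidate
    return None


# The entries of the `only` field from the .pydeps file above.
only = {
    "app", "app.main", "app.models", "app.repos",
    "app.routers", "app.schemas", "app.services", "app.storage",
}

node_for_msn = coarsen("app.routers.msn", only)   # folded into its parent
node_for_utils = coarsen("app.utils.text", only)  # falls back to "app"
node_for_other = coarsen("requests", only)        # not covered by `only`
```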

Goal:

In general, it should be possible to group all submodules under their parent module, like this:
Before:
image

After:
image

Question/Wish

  • I didn't find a way of doing that. Is it currently possible, and if so, how?

  • Or would this be a new feature?

Thanks!

This would be a new feature. PRs are very welcome ;-)

@thebjorn can you say a bit about how this could be implemented?

I'm looking a bit at the structure of pydeps <mod_name> --show-raw-deps, and it seems like what could happen is...

  • "collapse" that data down, so each entry is just a "module"
  • feed that new data into whatever generates a plot

Does that sound right? If you mention some of the relevant functions / modules involved, I'm willing to tinker with things..!

Example output of --show-raw-deps:

{
    "__main__": {
        "bacon": 0,
        "imports": [
            "test_mod",
            "test_mod.a",
            "test_mod.b",
            "test_mod.b.c"
        ],
        "name": "__main__",
        "path": null
    },
    "test_mod": {
        "bacon": 1,
        "imported_by": [
            "__main__",
            "test_mod",
            "test_mod.a",
            "test_mod.b"
        ],
        "imports": [
            "test_mod",
            "test_mod.a"
        ],
        "name": "test_mod",
        "path": "/Users/machow/Dropbox/Repo/pydeps/tmp/test_mod/__init__.py"
    },
...
}
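The "collapse" step could look roughly like the sketch below, which works directly on a dict shaped like the `--show-raw-deps` output. It is an assumption-laden illustration rather than pydeps code: `collapse_raw_deps` is a hypothetical helper, and it drops the `bacon` and `path` fields because they would have to be recomputed for the merged nodes.

```python
def collapse_raw_deps(raw, depth=1):
    """Merge entries of a --show-raw-deps style dict whose names share the
    same first `depth` dotted components."""
    def short(name):
        return ".".join(name.split(".")[:depth])

    merged = {}
    for name, entry in raw.items():
        target = merged.setdefault(
            short(name),
            {"name": short(name), "imports": set(), "imported_by": set()},
        )
        for key in ("imports", "imported_by"):
            target[key].update(short(n) for n in entry.get(key, []))

    for name, entry in merged.items():
        # A collapsed module importing itself carries no information.
        entry["imports"] = sorted(entry["imports"] - {name})
        entry["imported_by"] = sorted(entry["imported_by"] - {name})
    return merged


# Only the two entries shown in full above; the elided ones are left out.
raw = {
    "__main__": {
        "imports": ["test_mod", "test_mod.a", "test_mod.b", "test_mod.b.c"],
    },
    "test_mod": {
        "imported_by": ["__main__", "test_mod", "test_mod.a", "test_mod.b"],
        "imports": ["test_mod", "test_mod.a"],
    },
}
collapsed = collapse_raw_deps(raw, depth=1)
```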

Edit: it looks like it's passed in to depgraph_to_dotsrc(). Going to take a peek!

Alright--so I got something very rough working, but am sure there is a better way.

This script requires a file named types.json, which is the result of running something like...

python -m pydeps.py2depgraph some_script_with_imports.py > types.json

There are 5 parts to the script:

  1. defining a function to rename modules (e.g. a.b.c -> a.b)
  2. converting graph representation in types.json to a new graph
  3. fixing an error where the cli.verbose func doesn't exist
  4. creating a DepGraph object from new graph representation
  5. plotting

# python -m pydeps.py2depgraph script.py > types.json

import json
from collections import defaultdict
from itertools import chain

# 1. Function to do renaming of modules ----

def rename(node_name):
    # shortens a name to only include a single .
    # e.g. a.b.c -> a.b
    return ".".join(node_name.split(".")[:2])


# 2. Convert old output to new one ----

with open("types.json") as f:
    old_depgraph = json.load(f)

old_graph = old_depgraph["depgraph"]
new_graph = defaultdict(dict)

new_depgraph = {
    "types": old_depgraph["types"],
    "depgraph": new_graph
    }

# Every node name appears either as a key or inside one of the values.
all_old_nodes = chain(old_graph.keys(), *old_graph.values())
old_to_new_names = {k: rename(k) for k in all_old_nodes}

for k, entries in old_graph.items():
    new_entries = new_graph[old_to_new_names[k]]

    for old_node, old_path in entries.items():
        new_entries[old_to_new_names[old_node]] = old_path


# 3. Fix an error where making a DepGraph tries to use cli.verbose, ------
# but it doesn't exist (unless you call via the CLI)

from pydeps import cli

cli.verbose = cli._mkverbose(1)


# 4. Create a DepGraph for the new graph -------

from pydeps.depgraph import DepGraph

# TODO: args that need to be passed
# not sure how to get these, since they seem tied to the CLI
kwargs = {
        "show_cycles": False,
        "max_bacon": 2,
        "show_raw_deps": False,
        "show_deps": False,
        "exclude": [],
        "exclude_exact": [],
        "dummyname": None,
        "noise_level": 200,
        "display": None,
        }
# The types were already loaded into new_depgraph in part 2.
dg = DepGraph(new_graph, new_depgraph["types"], **kwargs)
#DepGraph(old_graph, old_depgraph["types"], **kwargs)


# 5. Plot ----

from pydeps.pydeps import depgraph_to_dotsrc
from pydeps import dot

dotsrc = depgraph_to_dotsrc("deps.dot", dg, **kwargs)
svg = dot.call_graphviz_dot(dotsrc, "svg")

with open("out.svg", "wb") as f:
    f.write(svg)

dot.display_svg(kwargs, "out.svg")

Here's the output from running it on a library called siuba, which has a bunch of submodules. E.g. the imports of siuba.dply.verbs are consolidated into siuba.dply.

image

Interesting. I'll take a deeper look at it later in the week when I get some free time :-)

Any progress on that?

I am also interested in this.

Also, interested

Sorry that this took a little while. Could you test the (undocumented) --max-module-depth flag in v1.10.19 available on PyPI?

pydeps --max-module-depth=2 packagename

It should work with the --cluster flag, but will possibly/probably mess with the --max-bacon flag and the --min/max-cluster-size flags.

Hi, thanks for your effort! I never imagined someone would implement it :).

I will test it when I can.

I looked at your source code and had an idea that might help with this in the future:

I believe this capability could be implemented by post-processing the generated .dot file, and it would not be limited to Python. So, maybe the original dot file could be parsed (with pydot or pygraphviz) and the name merging could be done there to generate a second file, thus sparing you from having to couple this code with your original code.
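To make the idea concrete, here is a rough standard-library sketch of such a post-processor. It assumes a simplified DOT file with one edge per line and nodes written as quoted dotted names, which may not match what pydeps actually emits. `merge_dot_edges` is a hypothetical helper, `app.services.mail` is a made-up module name, and node attributes are simply dropped. A real implementation would more likely use a proper DOT parser such as pydot or pygraphviz.

```python
import re

# Matches lines such as:   "a.b.c" -> "d.e";
EDGE = re.compile(r'^\s*"([^"]+)"\s*->\s*"([^"]+)"')


def merge_dot_edges(dot_text, depth=2):
    """Rewrite a simplified DOT digraph so that node names are cut to
    `depth` dotted components and duplicate edges are merged."""
    def short(name):
        return ".".join(name.split(".")[:depth])

    edges = set()
    for line in dot_text.splitlines():
        match = EDGE.match(line)
        if match:
            src, dst = short(match.group(1)), short(match.group(2))
            if src != dst:  # skip self-loops created by merging
                edges.add((src, dst))

    body = "\n".join(f'    "{s}" -> "{d}";' for s, d in sorted(edges))
    return "digraph G {\n" + body + "\n}\n"


dot_in = '''digraph G {
    "app.routers.msn" -> "app.services.mail";
    "app.routers.program" -> "app.services.mail";
    "app.routers.msn" -> "app.routers.program";
}
'''
dot_out = merge_dot_edges(dot_in)
```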

In this case, your code would read options like the one you named max module depth, and the parser would call the post-processor.

I hope this can make your life easier :)

@namoscagnm it's a reasonable idea, but considerable effort is made to get the analysis into a DepGraph instance and for it to be easy to work with, so it's better to keep all graph manipulations there. I do realize that, inconsistently, the cluster code is implemented in the RenderBuffer class :-D (it needs support for subgraph dot elements...)

Hi @thebjorn thank you for the change. At least for me, it works like a charm 🔥

For me as well. Love it.