thebjorn/pydeps

Output result is a little bit confusing.

Closed this issue · 2 comments

I want to find all test-related dependencies for a Python repository. When I run pydeps on different directories, the output is a bit confusing.

For example, take https://github.com/ansible/ansible.
If I run from the root of the ansible repo, I expect dependencies to be generated recursively for all Python files under the root, including both test and non-test code. Instead, no dependencies are generated for test code, such as the file ansible/test/units/template/test_native_concat.py.

If I run against two different test subfolders, for example ./test/units/template and ./test/units/utils/collection_loader, the results are almost identical except for a very few lines that differ in their name and path attributes.
What I expected is that each test subfolder would produce dependencies for that folder only. For example, running against ./test/units/template should generate dependencies only for files under ./test/units/template.

What am I missing here? I thought this expectation was common sense for what the result should be for a given folder, so why the current behavior? What is the most efficient way (in terms of the least amount of time I need to run the tool, preferably only once) to generate all test-related dependency data for any repository?

I ran all of the above using the same options: pydeps --show-raw-deps --no-output --include-missing --max-bacon 5 <targetdir>

Thanks for your insight.

Ansible is a little too big for a test case, but generally speaking, pydeps is more targeted towards packages/modules than directories.

Pydeps (and modulefinder) works by following the import chain (really the python import bytecodes) - and it starts by creating a dummy module that contains imports of all python source files in the target.
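To make the dummy-module idea concrete, here is a minimal sketch using only the standard library's modulefinder. It is not pydeps' actual code: the helper name modules_reached, the file layout in FILES, and the _dummy.py script are all made up for illustration. It writes a tiny project to a temporary directory, generates a dummy script that imports the target modules, and lists what modulefinder reaches from there.

```python
import os
import tempfile
from modulefinder import ModuleFinder

# Hypothetical layout: two test packages that both import from "proj".
FILES = {
    "proj/__init__.py": "",
    "proj/core.py": "def helper():\n    return 1\n",
    "proj/unused.py": "",
    "tests_a/__init__.py": "",
    "tests_a/test_one.py": "from proj.core import helper\n",
    "tests_b/__init__.py": "",
    "tests_b/test_two.py": "from proj.core import helper\n",
}


def modules_reached(files, target_modules):
    """Return the module names modulefinder reaches from a dummy script
    that imports every module listed in target_modules."""
    with tempfile.TemporaryDirectory() as root:
        # Materialise the illustrative source tree on disk.
        for rel_path, source in files.items():
            full_path = os.path.join(root, rel_path)
            os.makedirs(os.path.dirname(full_path), exist_ok=True)
            with open(full_path, "w") as fh:
                fh.write(source)
        # The dummy module: one import statement per target module.
        dummy = os.path.join(root, "_dummy.py")
        with open(dummy, "w") as fh:
            fh.write("".join(f"import {name}\n" for name in target_modules))
        # Restrict the search path to the temp dir so only our files are found.
        finder = ModuleFinder(path=[root])
        finder.run_script(dummy)
        return sorted(name for name in finder.modules if name != "__main__")


reached_a = modules_reached(FILES, ["tests_a.test_one"])
reached_b = modules_reached(FILES, ["tests_b.test_two"])
print(reached_a)
print(reached_b)
print(sorted(set(reached_a) & set(reached_b)))  # the part both runs share
```

In this toy layout both runs pull in proj and proj.core, while proj.unused never appears because nothing imports it. That is the same pattern as two test folders that both import from ansible.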

A little-known feature of the Python import system is that

 from a.b.c import d

actually imports a, a.b, a.b.c, and (when d is itself a module) a.b.c.d - and thus these become new roots for pydeps to follow. If, in your two directories, you have from ansible.xxx import yyy, I would expect the ansible.* tree to be pulled in from both places, so the dependencies will look very similar.
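The parent-package behavior can be checked at runtime with a small sketch. The package name demo_a and the helper loaded_by_from_import are invented for this example; the code builds a throwaway demo_a.b.c package with a submodule d, runs the from-import, and reports which demo_a entries ended up in sys.modules.

```python
import importlib
import os
import sys
import tempfile


def loaded_by_from_import():
    """Build a throwaway package demo_a.b.c with a submodule d, execute
    'from demo_a.b.c import d', and return the demo_a.* modules loaded."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = os.path.join(root, "demo_a", "b", "c")
        os.makedirs(package_dir)
        # Every level needs an __init__.py to be a regular package.
        for parts in (("demo_a",), ("demo_a", "b"), ("demo_a", "b", "c")):
            open(os.path.join(root, *parts, "__init__.py"), "w").close()
        with open(os.path.join(package_dir, "d.py"), "w") as fh:
            fh.write("VALUE = 42\n")

        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            exec("from demo_a.b.c import d", {})
            return sorted(
                name for name in sys.modules
                if name == "demo_a" or name.startswith("demo_a.")
            )
        finally:
            # Undo the side effects so the demo leaves no trace behind.
            sys.path.remove(root)
            for name in list(sys.modules):
                if name == "demo_a" or name.startswith("demo_a."):
                    del sys.modules[name]


print(loaded_by_from_import())
# ['demo_a', 'demo_a.b', 'demo_a.b.c', 'demo_a.b.c.d']
```

Each parent package's __init__.py is executed along the way, so whatever those files import is reachable as well.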

The default --max-bacon=2 prunes the import tree two steps from the initial roots (i.e. the dummy module that is created). Combining --max-bacon=1 with --max-module-depth=2 can usually give you a higher-level view of how things are connected.
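As a rough mental model of those two options (an assumption-laden sketch, not pydeps' implementation), hop-based pruning can be pictured as a breadth-first walk that stops a fixed number of import steps from the roots, and module-depth limiting as truncating dotted names. The graph IMPORTS and the function names within_hops and truncate_name are hypothetical.

```python
from collections import deque

# Hypothetical import graph: module -> modules it imports.
IMPORTS = {
    "dummy": ["tests.test_a"],
    "tests.test_a": ["proj.core"],
    "proj.core": ["proj.utils.text"],
    "proj.utils.text": ["json"],
}


def within_hops(imports, roots, max_hops):
    """Return {module: hops} for every module reachable from the roots
    in at most max_hops import steps (breadth-first search)."""
    hops = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not follow imports past the hop limit
        for target in imports.get(current, ()):
            if target not in hops:
                hops[target] = hops[current] + 1
                queue.append(target)
    return hops


def truncate_name(name, depth):
    """Keep only the first `depth` dotted components of a module name."""
    return ".".join(name.split(".")[:depth])


print(within_hops(IMPORTS, ["dummy"], 2))
# {'dummy': 0, 'tests.test_a': 1, 'proj.core': 2}
print(truncate_name("proj.utils.text", 2))
# proj.utils
```

With a limit of two hops, proj.utils.text and json fall outside the result, which mirrors why a small hop limit hides deeper parts of the import tree.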

I'm not sure if this will help you with your real problem...?

My original problem is how to avoid running pydeps on many test folders one by one; I was hoping to run the dependency analysis once on an ancestor folder and get complete dependency information. Your answer explains the root cause of this behavior, but it doesn't solve that problem, so I will need to look for another approach to improve performance.
Thank you for always replying in detail; it helps a lot in understanding the tool. Nice work maintaining it :)