Speed up `deptry` by using Rust
fpgmaas opened this issue · 2 comments
Is your feature request related to a problem? Please describe.
While deptry is relatively fast, it could probably be sped up by using Rust. For example, a quick test run of deptry on aws-cli:
Time taken to extract dependencies: 0.004 seconds
Time taken to find all python files: 0.026 seconds
Time taken to find all local and stdlib modules: 0.002 seconds
Scanning 216 files...
Time taken to find all imports: 0.190 seconds
Time taken to create the 'ModuleLocations' objects: 0.003 seconds
<omitted output>
Time taken to report: 0.003 seconds
Complete runtime: 0.228 seconds
Running deptry on deptry itself gives:
Time taken to detect dependency management format: 0.001 seconds
Assuming the corresponding module name of package 'types-colorama' is 'types_colorama'. Install the package or configure a package_module_name_map entry to override this behaviour.
Time taken to extract dependencies: 0.017 seconds
Time taken to find all python files: 0.033 seconds
Time taken to find all local and stdlib modules: 0.001 seconds
Scanning 45 files...
Time taken to find all imports: 0.015 seconds
Time taken to create the 'ModuleLocations' objects: 0.002 seconds
Success! No dependency issues found.
Time taken to report: 0.000 seconds
Complete runtime: 0.069 seconds
Here, we see that in a large project like aws-cli, 83% of the time is spent on detecting the imports, i.e. reading the files, parsing each file into an AST, traversing the AST, and collecting all Import and ImportFrom nodes. In a smaller project like deptry, no single part of the application dominates the runtime. But then again, deptry runs in about 0.07 seconds there, which already sounds reasonably fast.
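For readers unfamiliar with that step, here is a minimal sketch of what "parse the AST and collect Import and ImportFrom nodes" can look like using only Python's standard `ast` module. It is an illustration, not deptry's actual extractor: the function name `top_level_imports` is hypothetical, and skipping relative imports is an assumption made for this example.

```python
import ast


def top_level_imports(source: str) -> set[str]:
    """Return the top-level module names imported by a piece of Python source.

    Hypothetical helper for illustration; not part of deptry's API.
    """
    modules: set[str] = set()
    # ast.walk visits every node, so imports nested in functions are found too.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b, c` contributes "a" and "c".
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # `from a.b import c` contributes "a".
            # Relative imports (level > 0) are skipped in this sketch.
            modules.add(node.module.split(".")[0])
    return modules


example = "import os.path\nfrom collections import OrderedDict\nfrom . import sibling\n"
print(sorted(top_level_imports(example)))  # ['collections', 'os']
```

Doing this for every file means one read, one parse, and one full tree walk per file, which is consistent with this step dominating the runtime on a project with a few hundred files.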
Describe the solution you would like
Let's see if we can speed up deptry by using Rust. Given the output of the small test runs above, the main target to replace with Rust seems to be the import extractors in deptry/deptry/imports.
Additional context
I will try to create an initial draft PR in the upcoming few days. I have 0 experience in Rust though, so I'll start with some tutorials and see where I get from there. Any more experienced Rust developers are welcome to contribute ;)
I am facing quite a few issues trying to get Poetry & maturin to work together. The few projects I could find that combine the two usually duplicate a lot of the project's metadata so that it is also available in the PEP 621-compatible format; see for example pyproject.toml in robyn.
Since we will likely need to make quite a few changes to our project to support maturin, my proposal is that we switch from Poetry to PDM to manage our dependencies.
In that case, pyproject.toml would look like this
I did some development on this over the weekend, and I just published a first draft to Test PyPI, see this workflow run. The results are quite promising. For each benchmark, I set up the environment with:
git clone --depth 1 git@github.com:aws/aws-cli.git
cd aws-cli
python -m venv venv
. ./venv/bin/activate
pip install -r requirements.txt -r requirements-dev.txt
Benchmark deptry 0.12.0
pip install deptry==0.12.0
hyperfine -i 'deptry .' --warmup 1
Benchmark 1: deptry .
Time (mean ± σ): 298.6 ms ± 5.7 ms [User: 274.8 ms, System: 21.9 ms]
Range (min … max): 292.4 ms … 307.5 ms 10 runs
Benchmark deptry + Rust
pip install \
--index-url https://test.pypi.org/simple/ \
--extra-index-url https://pypi.org/simple/ \
deptry==0.0.13a5
hyperfine -i 'deptry .' --warmup 1
Benchmark 1: deptry .
Time (mean ± σ): 109.3 ms ± 1.7 ms [User: 152.6 ms, System: 21.7 ms]
Range (min … max): 107.2 ms … 115.7 ms 26 runs
I did a manual check on the output to confirm that the reduced runtime is not simply because of deptry exiting early on an error; in both cases, deptry scans 219 files and finds 151 dependency issues.
So, on this particular project, that is a reduction of about 63% in runtime. I have not tested on any other projects yet, but it is a promising start :)
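For anyone who wants to check the arithmetic, the figure follows from the two hyperfine means reported above (298.6 ms and 109.3 ms). The helper names below are just for illustration.

```python
def runtime_reduction_pct(before_ms: float, after_ms: float) -> float:
    """Percentage of the original runtime that was saved."""
    return 100 * (before_ms - after_ms) / before_ms


def speedup_factor(before_ms: float, after_ms: float) -> float:
    """How many times faster the new run is compared to the old one."""
    return before_ms / after_ms


# Mean runtimes from the two hyperfine runs on aws-cli.
BEFORE_MS = 298.6  # deptry 0.12.0
AFTER_MS = 109.3   # deptry + Rust draft

print(f"{runtime_reduction_pct(BEFORE_MS, AFTER_MS):.1f}%")  # 63.4%
print(f"{speedup_factor(BEFORE_MS, AFTER_MS):.2f}x")         # 2.73x
```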