invl/pip-autoremove

missing dependencies

casperdcl opened this issue · 5 comments

First off, I'd like to say this is a great tool which makes pip almost as powerful as conda.

However, running pip-autoremove -L can incorrectly list packages as leaves when they are actually sub-dependencies of other packages:

(clean-venv) scikit-image$ pip install Cython pip-autoremove scikit-image
(clean-venv) scikit-image$ pip install -U .
...
Collecting toolz>=0.7.3; extra == "array" (from dask[array]>=0.9.0->scikit-image)
...
(clean-venv) scikit-image$ pip-autoremove -l $(pip-autoremove -L | cut -d' ' -f1) | grep -Eo "^\s*\S+"
pip-autoremove
pip
toolz  # <-- should be under dask under scikit-image
argparse
Cython
setuptools
matplotlib
    pyparsing
    backports.functools-lru-cache
    subprocess32
    pytz
    six
    python-dateutil
    cycler
    numpy
scikit-image
    PyWavelets
        numpy
    Pillow
    networkx
        decorator
    six
    dask
Python
wsgiref
wheel
scipy  # <-- should be under scikit-image?
    numpy

Whoops, got the wrong issue, I don't think I dealt with this one, sorry

I don't think this behavior is actually wrong, though. pip-autoremove relies on the package information from setuptools (i.e. the setup.py file). scikit-image's package metadata says its requirements are:

Projects/scikit-image ●» pip show scikit-image
Name: scikit-image
Version: 0.14.dev0
Summary: Image processing routines for SciPy
Home-page: http://scikit-image.org
Author: None
Author-email: None
License: Modified BSD
Location: /Users/bhartvigsen/.virtualenvs/scikit-image/lib/python2.7/site-packages
Requires: pillow, six, PyWavelets, networkx, dask
Required-by:

The additional packages installed via the requirements.txt file when you do pip install -U . are not in the setup.py so pip-autoremove can't "see" them. They are leaf nodes as far as setuptools/pip are concerned.
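
To make the "leaf" idea concrete, here is a minimal sketch of how leaves fall out of a dependency graph. This is not pip-autoremove's actual code: find_leaves is a hypothetical helper, and the requires mapping is a hand-copied subset of the output quoted in this thread.

```python
# Illustrative sketch only: `find_leaves` is a hypothetical helper, and the
# mapping below is a hand-written subset of the output quoted in this thread.
def find_leaves(requires):
    """Return packages that no other package in the mapping declares as a requirement."""
    required = {dep for deps in requires.values() for dep in deps}
    return sorted(name for name in requires if name not in required)


# Declared requirements as setuptools/pip see them (names lowercased).
requires = {
    "scikit-image": ["pillow", "six", "pywavelets", "networkx", "dask"],
    "scipy": ["numpy"],
    "pywavelets": ["numpy"],
    "networkx": ["decorator"],
    "numpy": [],
    "pillow": [],
    "six": [],
    "dask": [],
    "decorator": [],
    "toolz": [],
}

leaves = find_leaves(requires)
print(leaves)  # ['scikit-image', 'scipy', 'toolz']
```

Because nothing in the declared metadata points at scipy or toolz, both come out as leaves, which matches the listing in the original report.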

The command I gave should properly list such "leaf" nodes, though.

Okay, I went poking again, and I think I've found the underlying issue, but I'm not sure there is anything pip-autoremove can do about it. It looks like there are two different ways to mark something as required in the METADATA file: Requires: and Requires-Dist:. For scikit-image, the entries are:

Requires: numpy (>= 1.11)
Requires: scipy (>= 0.17.0)
Requires: matplotlib (>= 1.3.1)
Requires: networkx (>= 1.8)
Requires: six (>= 1.10.0)
Requires: pillow (>= 2.1.0)
Requires: PyWavelets (>= 0.4.0)
Requires: dask (>= 0.9.0)
Requires-Dist: networkx (>=1.8)
Requires-Dist: six (>=1.10.0)
Requires-Dist: pillow (>=2.1.0)
Requires-Dist: PyWavelets (>=0.4.0)
Requires-Dist: dask[array] (>=0.9.0)

WorkingSet appears to read the Requires-Dist: attributes only. For scikit-image, when installed via pip install -U . that list boils down to networkx, six, pillow, pywavelets, and dask.
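
As a rough way to check which packages are declared only under Requires:, the METADATA text can be parsed with the standard library, since it uses RFC 822 style headers. This is a sketch under assumptions: requirement_names is a hypothetical helper with deliberately naive name extraction, and the METADATA string is just the excerpt quoted above, not the full file.

```python
import re
from email.parser import Parser

# The excerpt quoted above; a real METADATA file contains many more headers.
METADATA = """\
Requires: numpy (>= 1.11)
Requires: scipy (>= 0.17.0)
Requires: matplotlib (>= 1.3.1)
Requires: networkx (>= 1.8)
Requires: six (>= 1.10.0)
Requires: pillow (>= 2.1.0)
Requires: PyWavelets (>= 0.4.0)
Requires: dask (>= 0.9.0)
Requires-Dist: networkx (>=1.8)
Requires-Dist: six (>=1.10.0)
Requires-Dist: pillow (>=2.1.0)
Requires-Dist: PyWavelets (>=0.4.0)
Requires-Dist: dask[array] (>=0.9.0)
"""


def requirement_names(metadata_text, header):
    """Hypothetical helper: lowercased project names listed under one header."""
    message = Parser().parsestr(metadata_text)
    names = []
    for value in message.get_all(header, []):
        # Naive: take the leading name, ignoring extras and version specifiers.
        match = re.match(r"[A-Za-z0-9._-]+", value.strip())
        if match:
            names.append(match.group(0).lower())
    return names


legacy = requirement_names(METADATA, "Requires")
dist = requirement_names(METADATA, "Requires-Dist")
only_legacy = sorted(set(legacy) - set(dist))
print(only_legacy)  # ['matplotlib', 'numpy', 'scipy']
```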

toolz is not caught under dask because pip-autoremove isn't parsing/processing the extra information (array). That could possibly be fixed. But the others (matplotlib, numpy, scipy) are not seen by setuptools as actual requirements and would be considered leaves.
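
For what handling the extra might look like, here is a minimal sketch, not a patch for pip-autoremove. It assumes dask's metadata carries the line pip printed above (toolz>=0.7.3; extra == "array"). parse_requested and active_children are hypothetical helpers, and the marker handling is deliberately simplistic (only extra == "..." is understood).

```python
import re

# Assumption: the only dask requirement known from this thread is the line pip
# printed during install. Real metadata may list more entries.
DASK_REQUIRES_DIST = [
    'toolz>=0.7.3; extra == "array"',
]


def parse_requested(requirement):
    """Split e.g. 'dask[array] (>=0.9.0)' into ('dask', {'array'})."""
    match = re.match(r"\s*([A-Za-z0-9._-]+)\s*(?:\[([^\]]*)\])?", requirement)
    name = match.group(1).lower()
    extras = {e.strip() for e in (match.group(2) or "").split(",") if e.strip()}
    return name, extras


def active_children(requires_dist, extras):
    """Names of requirements that apply when the given extras are requested."""
    children = []
    for line in requires_dist:
        spec, _, marker = line.partition(";")
        extra = re.search(r'extra\s*==\s*["\']([^"\']+)["\']', marker)
        if extra and extra.group(1) not in extras:
            continue  # conditional on an extra that was not requested
        children.append(re.match(r"\s*([A-Za-z0-9._-]+)", spec).group(1).lower())
    return children


name, extras = parse_requested("dask[array] (>=0.9.0)")
print(name, extras)                                   # dask {'array'}
print(active_children(DASK_REQUIRES_DIST, extras))    # ['toolz']
print(active_children(DASK_REQUIRES_DIST, set()))     # []
```

With the extra taken into account, toolz becomes a child of dask; without it, toolz is dropped and ends up looking like a leaf.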

Perhaps you could compare this behavior with the output of pipdeptree. I use the two in conjunction to clean up requirements.