Gotcha: numpy version conflict if installing in existing environment with tensorflow 2.4.1
CatChenal opened this issue · 4 comments
Problem:
When installing kglab using pip in an existing (activated) environment, the latest version of numpy is installed (because requirements.txt includes 'numpy >= 1.19.4'). This may create conflicts with other packages.
Specific Case: latest numpy version and tensorflow 2.4.1 version conflict:
My activated env contains tensorflow 2.4.1.
Near the end of the installation process from pip install kglab, I got this error message (abbreviated):
[...]
Installing collected packages:
[...], kglab
Attempting uninstall: numpy
Found existing installation: numpy 1.19.2
Uninstalling numpy-1.19.2:
Successfully uninstalled numpy-1.19.2
**ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.4.1 requires numpy~=1.19.2, but you have numpy 1.20.2 which is incompatible.**
Successfully installed [all needed]
My fix:
pip uninstall numpy
pip install numpy==1.19.4
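To see why 1.19.4 is the version that satisfies both constraints, here is a minimal sketch of the two version checks involved: the compatible-release operator `~=` from the TensorFlow requirement and the `>=` minimum from kglab. The helper names (`parse`, `compatible_release`, `at_least`) are hypothetical and written for illustration; they handle plain numeric versions only and are not how pip itself evaluates specifiers.

```python
def parse(version):
    """Turn '1.19.2' into (1, 19, 2). Plain numeric releases only (assumption)."""
    return tuple(int(part) for part in version.split("."))


def compatible_release(installed, spec):
    """Check 'installed ~= spec': at least spec, and equal to spec on every
    component except the last one (so ~=1.19.2 means >=1.19.2 and ==1.19.*)."""
    inst, base = parse(installed), parse(spec)
    prefix = base[:-1]
    return inst >= base and inst[:len(prefix)] == prefix


def at_least(installed, minimum):
    """Check 'installed >= minimum'."""
    return parse(installed) >= parse(minimum)


# Constraints taken from the thread: tensorflow 2.4.1 wants numpy~=1.19.2,
# kglab's requirements.txt wants numpy>=1.19.4.
for candidate in ("1.19.2", "1.19.4", "1.20.2"):
    tf_ok = compatible_release(candidate, "1.19.2")
    kglab_ok = at_least(candidate, "1.19.4")
    print(candidate, "tensorflow:", tf_ok, "kglab:", kglab_ok)
```

Of the three versions that appear in the log, only 1.19.4 passes both checks: 1.19.2 is below kglab's minimum, and 1.20.2 falls outside the 1.19.* series that `~=1.19.2` allows.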
My (minimal) tests:
- kglab: ran "Sample Usage" code in https://derwen.ai/docs/kgl/start/ without error.
- tensorflow: ran code to "Verify the install" on https://www.tensorflow.org/install/pip.html without error.
Suggestion/Question:
Perhaps changing the numpy requirement from 'numpy >= 1.19.4' to 'numpy == 1.19.4' would force pip to install this first compatible version instead of the latest?
Thank you @CatChenal !
Yes, I've seen a related problem in my Ray tutorials, where TF is causing issues with the later versions of NumPy (1.20.x).
That's the best workaround that I could see, too.
For dependencies, we prefer to pin the versions using ranges.
Would it help if we pinned to >= 1.19, < 1.20 for now?
In general I'm reluctant to place an upper bound, since some people don't use TF and they need the latest NumPy for other integration purposes. Plus, I suspect that TF will catch up, eventually. Pandas and Arrow have some similar issues w.r.t. RAPIDS, although the latter is planning to catch up in the next release.
Also, I'll add a note in the (upcoming) FAQ.
Thanks @ceteri.
Would it help if we pinned to >= 1.19, < 1.20 for now?
I would hold off for the moment:
- My 'suggestion' should have been just a question (my bad!).
- I was a bit too hasty in installing a brand new package in an existing environment (end user problem).
- For my specific TensorFlow/Numpy conflict, I now know what the requirements are in the current TF 'REQUIRED_PACKAGES'.
- Until I test kglab with numpy==1.19.4 exhaustively, my "fix" is just a plausible hypothesis (not even a workaround)!
- I assume that fixing the upper bound of the Numpy version would require an audit of all the packages in requirements.txt to get their own Numpy dependency, then use the highest [a]. Is there another way?
[a]. It seems that pip is installing the highest release (i.e. Numpy 1.20.2, which is 23 days old as of this post): the audit would tell which package needs it (if any).
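One possible starting point for such an audit, sketched with the standard library only: list every distribution installed in the active environment that declares a NumPy requirement, together with the specifier it declares. This inspects installed package metadata rather than requirements.txt itself, and the helper names (`requirement_name`, `numpy_requirements`) are hypothetical.

```python
import re
from importlib import metadata  # standard library, Python 3.8+


def requirement_name(requirement):
    """Extract the project name from a requirement string such as
    'numpy~=1.19.2' or 'numpy (>=1.19.4) ; python_version >= "3.7"'."""
    return re.split(r"[\s;<>=!~(\[]", requirement, maxsplit=1)[0].lower()


def numpy_requirements(target="numpy"):
    """Map each installed distribution to the requirement strings it
    declares on `target` (empty dict if nothing depends on it)."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"] or "(unknown)"
        for requirement in dist.requires or []:
            if requirement_name(requirement) == target:
                found.setdefault(name, []).append(requirement)
    return found


if __name__ == "__main__":
    for package, requirements in sorted(numpy_requirements().items()):
        print(package, "->", requirements)
```

Run inside the activated environment, this prints one line per package that depends on NumPy, which makes conflicting specifiers (for example a `~=1.19.2` next to a `>=1.19.4`) easy to spot before installing anything new.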
Perhaps a warning box in README would suffice, e.g.:
WARNING on Installing kglab in an existing environment:
Installing a new package in an existing environment may reveal or create version conflicts. See the requirements of kglab in requirements.txt before you do. A known version conflict is that of Numpy in kglab (>= 1.19.4) and TensorFlow 2+ (~=1.19.2).
Just did a roll back of the NumPy requirement, so this should work fine with >= 1.19.2 now.
Also added your language above as a warning, along with notes about the associated PEP 517 errors that may come up.
Many thanks @CatChenal !