Rename rdkit-pypi to rdkit; avoid conflicts with conda
Closed this issue · 14 comments
Hi there, thanks for building such a great package!
It's currently possible to install rdkit from both conda and pip simultaneously. I understand that you may not have access to rdkit in PyPI (so simply changing name in setup.py may be off the table), but I'm wondering if there's any way to rename the package so that it replaces the conda version instead of sitting alongside it?
Note that I don't think there are import conflicts here; the package still gets installed to the same place. But the package managers could show different things and cause uncertainty about what is actually imported. For example, I installed rdkit-pypi (2021.9.5.1) and then conda rdkit (2021.03.5).
pip list shows:
rdkit-pypi 2021.9.5.1
while conda list has two entries:
rdkit 2021.03.5 py39h88273a1_0 conda-forge
rdkit-pypi 2021.9.5.1 pypi_0 pypi
and rdkit.__version__ is '2021.03.5'.
This is a good point. I believe the best solution is to get access to rdkit on PyPI. I believe the maintainer is Greg (from the RDKit core team). I will contact him.
I am not sure, though, whether this will solve the issue. I do not have much experience with Conda. Could you please try installing a Python package that has the same name in both places (numpy, for example) with conda and with pip (i.e., the pip in the conda environment)? Do they coexist? If so, this would not help. I assume Conda always picks the conda-forge version over the PyPI version.
I see similar behavior with numpy, so maybe the best solution is just to avoid using both conda and pip :)
pip list:
numpy 1.22.4
conda list:
numpy 1.22.4 pypi_0 pypi
numpy-base 1.22.3 py39h3b1a694_0
and:
>>> np.__version__
'1.22.3'
That said, it would still be nice to be able to pip install rdkit
so I think asking @greglandrum about that is a good idea.
@kuelumbus actually I think numpy is a special case due to its complexity. For a simpler package like seaborn there does not appear to be any conflict (although conda shows the wrong version) as there is only a single package name.
Does this mean that even if RDKit would have the same name, it is possible to have a conda and pip version installed in one conda environment? (I assume conda installs packages to a different directory than pip.)
Also, why do you have two rdkit versions installed in one conda environment (or how did this happen)? Is this a standard case conda users end up with?
Does this mean that even if RDKit would have the same name, it is possible to have a conda and pip version installed in one conda environment? (I assume conda installs packages to a different directory than pip.)
No, you'll just have one install. Conda and pip both install packages in .../site-packages/, but they store their metadata separately, so the version shown by conda|pip list could be different even though they are pointing to the same source files.
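To see the mismatch described above from inside Python, one option is to compare the version recorded in the installed distribution metadata with the version the imported module reports. The following is only a minimal sketch: the helper name report_versions is hypothetical, and importlib.metadata reads the pip-style metadata found on sys.path, not conda's own records, so it approximates what pip list sees rather than what conda list sees.

```python
import importlib
from importlib import metadata


def report_versions(module_name, dist_names):
    """Compare recorded distribution versions with the imported module's version.

    report_versions is a hypothetical helper written for illustration only.
    """
    recorded = {}
    for name in dist_names:
        try:
            # Version stored in the distribution metadata (what pip would list).
            recorded[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # No metadata is installed under this distribution name.
            recorded[name] = None
    try:
        module = importlib.import_module(module_name)
        # Version of the files that actually get imported, if the module sets one.
        imported = getattr(module, "__version__", None)
    except ImportError:
        imported = None
    return {"imported": imported, "recorded": recorded}


if __name__ == "__main__":
    # The two distribution names discussed in this thread.
    print(report_versions("rdkit", ["rdkit", "rdkit-pypi"]))
```

If the imported version differs from a recorded one, the metadata and the files in site-packages are out of sync, which is the situation shown in the first comment.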
Also, why do you have two rdkit versions installed in one conda environment (or how did this happen)? Is this a standard case conda users end up with?
That's a contrived case for illustration, but it could happen if users install packages that depend on rdkit from both sources.
Thanks for clarifying. I believe the only thing that I can do is to rename the package on PyPI to rdkit. But it seems to me the best solution is to use either conda or pip and not mix them.
That said, it would still be nice to be able to
pip install rdkit
so I think asking @greglandrum about that is a good idea.
@kuelumbus I am happy to do this; send me email and we can figure out how to do the transfer
Ha! I see you have already done so. :-)
Came here to suggest using rdkit as the distribution name for the wheel releases. I'm super happy to see it happening :)
Developers being able to declare their packaging dependency simply on rdkit right away sets them up for a future where the conda and pip installs are properly aware of each other:
As @skearnes was saying, conda and pip install to the same location in site-packages. This seems weird, but it makes sense if you consider the second install as an update. The problem is that the tools might not be properly aware of the existing version.
On the conda side, there is a configuration option that will fix this with the new distribution name: https://docs.conda.io/projects/conda/en/latest/user-guide/configuration/pip-interoperability.html
However, most of the time the pip install comes after the conda one, and there the problem comes from rdkit not installing the package properly: it is missing distribution info. Long story short, after the conda install alone, pip list doesn't show it.
I came across the issue with conda+pip before and commented on the situation in the conversation on an older PR that was about packaging: rdkit/rdkit#2690 (comment)
TL;DR
Sorry for the long message and backseat commenting. I just hope to make everyone aware that moving the PyPI name is really great but only half the fix. Until the rdkit source install (conda) is fixed, pip will continue to always overwrite.
I just uploaded the recent RDKit version to https://pypi.org/project/rdkit/. You should now be able to install RDKit using
pip install rdkit
I am planning to keep the repos rdkit-pypi and rdkit on PyPI in sync for some time but retire rdkit-pypi in the future.
awesome. Thanks @kuelumbus
Thanks @kuelumbus and @greglandrum! I'll send a PR soon to update the RDKit installation docs.
Ah I see @kuelumbus already submitted a PR (rdkit/rdkit#5373); thanks!