rapidfuzz/python-Levenshtein

PyPI package name conflict

Closed · 9 comments

This repository has the same PyPI name as this one https://github.com/ztane/python-Levenshtein/blob/master/setup.py#L18 and everyone using this package https://github.com/seatgeek/thefuzz/blob/master/setup.py#L26 installed yours after an update instead of the intended one.
Please rename it and remove the newer versions from PyPI.

I am the new maintainer of the project since Antti Haapala did not have the time to continue maintaining the package. Do you have any specific issue with the new versions of the package?

This package depends on yours and, since yours contains no code, it is failing.

I am pretty sure you are running into seatgeek/thefuzz#35, which is solved by uninstalling python-Levenshtein and Levenshtein and then simply reinstalling.

Did this solve your issue?

I ended up pinning the version to python-Levenshtein==0.12.2 until this is solved properly.

And what would solve this "properly" in your eyes? I did the following:

  • historically the package was called python-Levenshtein on PyPI, but actually installed a package called Levenshtein
  • I created a fork called Levenshtein, both on PyPI and locally
  • I am the new maintainer of python-Levenshtein and merged the two packages by making python-Levenshtein an empty package that depends on the Levenshtein package

I did not realize that pip apparently cannot handle this upgrade, so upgrading from python-Levenshtein==0.12.2 simply leads to a broken installation. I think this should be classified as a bug in pip. However, there is really nothing I can do about it. For users, this is resolved by uninstalling the package and installing it again.
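A minimal sketch of how one might detect the broken state described above before deciding to uninstall and reinstall. It assumes that a healthy install exposes an importable `Levenshtein` module with a `distance` attribute; the helper names (`installed_version`, `module_is_usable`, `is_broken`, `diagnose`) are made up for illustration and are not part of either package.

```python
import importlib
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def module_is_usable(module_name, required_attr):
    """True if the module imports and exposes the expected attribute."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        return False
    return hasattr(module, required_attr)


def is_broken(versions, usable):
    """Broken: pip metadata lists a distribution, but the module is unusable."""
    return any(versions.values()) and not usable


def diagnose(module_name="Levenshtein",
             dist_names=("python-Levenshtein", "Levenshtein"),
             required_attr="distance"):
    """Compare what pip metadata reports with what can actually be imported."""
    versions = {name: installed_version(name) for name in dist_names}
    usable = module_is_usable(module_name, required_attr)
    return {"versions": versions, "usable": usable,
            "broken": is_broken(versions, usable)}


if __name__ == "__main__":
    report = diagnose()
    print(report)
    if report["broken"]:
        # The repair suggested in this thread: uninstall both, then reinstall.
        print("Run: pip uninstall python-Levenshtein Levenshtein, "
              "then: pip install python-Levenshtein")
```

A check like this could run as a post-deploy step so that a pipeline fails loudly instead of shipping an environment where the import breaks at runtime.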

Then the problem lies with all the packages that depend on this one without pinning the version. I will create a PR against thefuzz to make sure it uses Levenshtein instead of python-Levenshtein.

With a complex server structure and deployment pipelines, it is not that easy to "uninstall and install" a package. We normally upgrade (or downgrade if something goes wrong).

I opened an issue in the pip repository yesterday: pypa/pip#11563, since I really think that package upgrades should not be able to break installations. I do not see any reason why:

python3.10 -m pip install python-Levenshtein==0.12.2
python3.10 -m pip install -U python-Levenshtein

should behave any differently from

python3.10 -m pip install python-Levenshtein

> Then the problem lies with all the packages that depend on this one without pinning the version. I will create a PR against thefuzz to make sure it uses Levenshtein instead of python-Levenshtein.

The problem is that the overall situation with Levenshtein/python-Levenshtein was even more broken. Installing them side by side would break each other's installation:

python3.10 -m pip install python-Levenshtein==0.12.2
python3.10 -m pip install Levenshtein

is broken, since it will install both packages into the same directory. This is resolved in newer versions, but apparently the upgrade is broken for the same reason.
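To make the "same directory" problem concrete, here is a rough sketch of how one could list files that more than one installed distribution claims. It assumes the conflict shows up as overlapping entries in each distribution's recorded file list, which `importlib.metadata.files` exposes; the function names are hypothetical and this is not how pip itself checks for conflicts.

```python
from collections import defaultdict
from importlib import metadata


def find_shared_files(files_by_dist):
    """Map each path claimed by more than one distribution to its owners."""
    owners = defaultdict(set)
    for dist_name, paths in files_by_dist.items():
        for path in paths:
            owners[str(path)].add(dist_name)
    return {path: sorted(names)
            for path, names in owners.items() if len(names) > 1}


def installed_files(dist_names):
    """Collect the recorded files of the given installed distributions."""
    result = {}
    for name in dist_names:
        try:
            files = metadata.files(name)
        except metadata.PackageNotFoundError:
            continue  # distribution is not installed; nothing to record
        result[name] = list(files or [])  # files() may return None
    return result


if __name__ == "__main__":
    dists = ["python-Levenshtein", "Levenshtein"]
    shared = find_shared_files(installed_files(dists))
    for path, names in sorted(shared.items()):
        print(path, "<-", ", ".join(names))
```

If two distributions both list the same module files, uninstalling or upgrading one of them can remove files the other still needs, which would be consistent with the breakage described here.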

> With a complex server structure and deployment pipelines, it is not that easy to "uninstall and install" a package. We normally upgrade (or downgrade if something goes wrong).

Hm, I do not work in this field. I agree that upgrades should just work. However, I find it surprising that, as far as I can tell, this would mean that when a project has a broken upgrade path for a single version, you're stuck with the old version indefinitely, since you will never be able to perform the upgrade without "uninstall and install".

Regarding library usage, I would recommend that most projects use rapidfuzz instead of python-Levenshtein/Levenshtein/thefuzz/fuzzywuzzy:

  • largely API compatible with thefuzz/fuzzywuzzy -> for most projects it is enough to change the import
  • at least when using thefuzz[speedup], rapidfuzz is already a transitive dependency anyway -> fewer dependencies
  • more permissive license (MIT vs. GPL), which matters for a large number of projects using these libraries
  • significantly faster
  • like thefuzz/fuzzywuzzy, provides a pure Python fallback but, unlike them, always produces the same results in both implementations
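As a rough illustration of the "change the import" point, the sketch below assumes a project that only calls `fuzz.ratio`, which exists in both thefuzz and rapidfuzz. The `similarity` wrapper is a made-up name, and the guarded import is only there so the snippet still loads where rapidfuzz is not installed.

```python
try:
    # Before the switch this line would read: from thefuzz import fuzz
    from rapidfuzz import fuzz
except ImportError:
    fuzz = None  # rapidfuzz is not installed in this environment


def similarity(a, b):
    """Return a 0-100 similarity score for two strings via fuzz.ratio."""
    if fuzz is None:
        raise RuntimeError("rapidfuzz is not installed")
    return fuzz.ratio(a, b)


if __name__ == "__main__":
    if fuzz is not None:
        print(similarity("python-Levenshtein", "Levenshtein"))
```

One difference to watch for, as far as I know: rapidfuzz returns floats where thefuzz returns integers, so code that compares scores for exact equality may need a small adjustment.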

@semicolom this is a known issue in pip, which is hard to solve with the current architecture of pip. See pypa/pip#8509 for reference. Similar to my recommendation above, the maintainers of ansible told its users to uninstall and then reinstall the package: ansible/ansible#70529. While I understand this makes the upgrade far more painful for users, there is nothing I can do about an issue in pip.