pypa/pip

Improve error message for wheels with non-compliant versions

bkurtz opened this issue Β· 20 comments

What did you want to do?
Our organizational development standards prefer that we include a git/VCS hash in most package version numbers. In other environments this has typically not been a problem; with Python, however, including a hash in the version number is not PEP 440 compliant.

Previous versions of pip had difficulty applying comparison logic (e.g. ~=, >) to these version numbers, but as long as we pinned a specific package version, installations worked.

pip install --extra-index-url https://our.local.package.source/artifactory/simple our-package==0.3.0.f038176.11

(apologies, these are internal code and I can't provide a public URL)

Output

ERROR: Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 171, in _merge_into_criterion
    crit = self.state.criteria[name]
KeyError: 'our-package'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/cli/base_command.py", line 210, in _main
    status = self.run(options, args)
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/cli/req_command.py", line 180, in wrapper
    return func(self, options, args)
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/commands/install.py", line 319, in run
    reqs, check_supported_wheels=not options.target_dir
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 122, in resolve
    requirements, max_rounds=try_to_avoid_resolution_too_deep,
  File "/usr/local/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 445, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "/usr/local/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 310, in resolve
    name, crit = self._merge_into_criterion(r, parent=None)
  File "/usr/local/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 173, in _merge_into_criterion
    crit = Criterion.from_requirement(self._p, requirement, parent)
  File "/usr/local/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 82, in from_requirement
    if not cands:
  File "/usr/local/lib/python3.6/site-packages/pip/_vendor/resolvelib/structs.py", line 124, in __bool__
    return bool(self._sequence)
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 96, in __bool__
    return any(self)
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 20, in _deduplicated_by_version
    for candidate in candidates:
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 203, in iter_index_candidates
    version=ican.version,
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 145, in _make_candidate_from_link
    link, template, factory=self, name=name, version=version,
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 290, in __init__
    wheel_version = Version(wheel.version)
  File "/usr/local/lib/python3.6/site-packages/pip/_vendor/packaging/version.py", line 298, in __init__
    raise InvalidVersion("Invalid version: '{0}'".format(version))
pip._vendor.packaging.version.InvalidVersion: Invalid version: '0.3.0.f038176.11'

It appears that when installing from wheel, the package version number is parsed using a regex that only matches PEP 440 compliant version numbers, which fails when given a version number containing a git hash.
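
For illustration, the same check can be reproduced outside of pip with the standalone packaging library (a minimal sketch; pip vendors the same code, and the exact error text can differ between packaging releases):

from packaging.version import Version, InvalidVersion

try:
    Version("0.3.0.f038176.11")   # the bare git-hash segment is not valid under the PEP 440 grammar
except InvalidVersion as exc:
    print(exc)                    # e.g. Invalid version: '0.3.0.f038176.11'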

Additional information
We have found that installing the affected packages from source (by adding --no-binary our-package to the pip command line) acts as a workaround. It is unclear to me why the source installation does not enforce the same version number checks that the binary installation does.

FWIW you can include a git hash in the version using local version numbers, so something like 0.3.0+f038176.11.
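
For example, a minimal sketch with the packaging library (expected output shown in comments):

from packaging.version import Version

v = Version("0.3.0+f038176.11")   # "+f038176.11" is a PEP 440 local version label
print(v.public)                   # 0.3.0
print(v.local)                    # f038176.11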

Ah, that's good to know and I'd somehow missed that last time I looked through the specs. I'll see if that might be a better long-term solution to our needs (it would be nice to allow other comparison operators besides strict equality).

Still seems like it's a bug that versioning isn't treated the same between binary/source installs.

Yea it should probably treat them the same between binary and source installs for sure.

One thing to pay attention to is that local versions have some uh "interesting" semantics. If you're just generally using strict equality OR you're using the local version to communicate extra information it'll be totally fine. But it might be surprising if you're trying to use local versions as part of the meaningful part of the version.
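
To illustrate those semantics, here is a minimal sketch with the packaging library (expected results in comments): a bare "==X.Y.Z" specifier matches any local variant of X.Y.Z, and the ordering operators ignore the local segment entirely.

from packaging.version import Version
from packaging.specifiers import SpecifierSet

v = Version("0.3.0+f038176.11")
print(v in SpecifierSet("==0.3.0"))             # True: local label ignored for a bare ==
print(v in SpecifierSet("==0.3.0+f038176.11"))  # True: exact match including the local label
print(v in SpecifierSet(">=0.3.0"))             # True: ordering operators ignore the local label
print(v > Version("0.3.0"))                     # True: direct comparison sorts it above the bare release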

--no-binary circumvents all the metadata validation checks today, since it's not installing via wheels, but that "loophole" will be plugged when we drop setup.py install.

jre21 commented

When was it decided that pip would only support PEP440 compliant version strings, and why wasn't there a deprecation cycle for this? Given that the pip 10.0.0b1 changelog includes

  • Support the installation of wheels with non-PEP 440 version in their filenames. (#4169)

this seems like exactly the sort of change in policy that needs to be communicated to users ahead of time rather than introduced into a release with no fanfare whatsoever.

When was it decided that pip would only support PEP440 compliant version strings, and why wasn't there a deprecation cycle for this?

This is a part of the changes being made as part of the new dependency resolver. The new dependency resolver is stricter across the board with metadata generated by projects. This behaviour has been a part of pip's new resolver which has been in beta for the past few months, and has had several rounds of publicity and communication telling users to test it since we are making significant changes (such as this one!) to it. As noted multiple times across various communication channels, we have a long list of changes toward being stricter about what metadata pip accepts from packages.

There's more discussion about this specific stricter behaviour in another recently-filed issue: #9183 (comment)

So, that sounds like we're okay for the short term using source installs, but the writing is on the wall and we need to move toward PEP-440 compliant versioning for all our packages.

WRT warning or lack thereof about this particular change, you all have done a generally phenomenal job of publicizing the need for additional testing of the new resolver. I guess where I went astray was that I'd been thinking of the "new resolver" as one set of potentially breaking changes, and so once I'd tested against those back in 20.2.3 or whatever a couple of months ago, I was assuming it was all bug fixes the rest of the way out, which seems not to have been the case. Ah well!

In terms of this ticket, it seems like:

  • the existing behavior with strict version checking of wheels is intentional (not a bug)
  • the existing behavior of non-strict checking for source packages is unintentional (arguably a bug) but has plans (elsewhere?) to be addressed by 21.0.

Is there any reason to keep this ticket open, or shall I go ahead and close?

Let’s keep this open until at least we provide a better error message for the failure case.

Can't we make this optional rather than raising an exception? Unfortunately, the way we version packages does not conform to this, and we don't intend to change it.

@jakab922 You have to switch at some point. By not conforming, the wheels you generate are technically invalid. The resolver needs version semantics to resolve, and the best it can do is to emit a warning and skip the wheel entirely, which is likely not what you'd want.

You have to switch at some point. By not conforming, the wheels you generate are technically invalid.

This is true; however, it's a very bad day to wake up and find that internal packages which used to resolve just fine are now completely broken, with a red herring error message (especially since pip always encourages users to update to the latest version). This is user-hostile behavior and part of the reason why Python packaging has such a bad reputation. I'm not trying to point fingers, and I appreciate the effort to conform better to the relevant PEPs. However, I think pip needs to be much more conservative with breaking changes. There is a huge body of packages with "technically invalid" metadata, not just in private indexes but also on PyPI. From a user's point of view, it's strictly better to be able to install packages than to be conforming.

FWIW you can include a git hash in the version using local version numbers, so something like 0.3.0+f038176.11.

While this does work with pip, we have some internal packages that also have NPM package counterparts and using + in the version does not seem to work correctly with NPM. We were using a - before because it worked with both package managers. Now I've had to waste a bunch of time fixing build pipelines.

Sorry, but I also came to point out that such breaking changes must have a grace period with appropriate warning before you start breaking things for people.
It's quite clear that you didn't think much about the people this change could cause problems for. You should have rolled back this change already. Poorly done. 👎

There was a grace period for this. Quoting myself from earlier in this thread:

This is a part of the changes being made as part of the new dependency resolver. The new dependency resolver is stricter across the board with metadata generated by projects. This behaviour has been a part of pip's new resolver which has been in beta for the past few months, and has had several rounds of publicity and communication telling users to test it since we are making significant changes (such as this one!) to it. As noted multiple times across various communication channels, we have a long list of changes toward being stricter about what metadata pip accepts from packages.

This is the absolute wrong place for this kind of check. It should be done BEFORE the package is even built, as part of the packaging process.

Currently, there exist tons of failing-to-comply-to-PEP-440 packages already released in the wild (the bulk of them in internal/private repos). Saying they are no longer allowed to be installed because of a version check in the installer puts the burden on the end user/developer trying to make use of the package. As a package user, I can't do anything about a bad version; the package's developer is the only one who can fix it.

Unless it's absolutely necessary for the functioning of the dependency resolution system, this feature does not belong here in pip. It belongs at the start of the packaging process.

Unless it's absolutely necessary for the functioning of the dependency resolution system, this feature does not belong here in pip.

It is, for things to work with a proper resolver.


The new dependency resolver's rollout (which is what contained this change) has had the most communication I've seen for any change in the Python packaging ecosystem, especially compared to any of the other changes pip has made in its lifetime. We actively reached out to publicise this release across many avenues and communication channels. This also included reaching out to PSF sponsors and asking them to test the beta as well.

I want to encourage folks to think about how they could've avoided being blindsided by this change. If you didn't see that there was a major change coming and test before the change happened, do take the time to see if any of pip's communication channels (discuss.python.org, Python-announce, pypi-announce, being a PSF sponsor, etc) would've helped you hear about this change before it happened. Honestly, I'm not sure what more we could have done to make it so that folks adapt for this change in time, and I'm not sure what else we can do differently when making breaking changes.

(no, the answer is not to never make breaking changes; if you want that, please use something else or insulate your pipeline from changes in pip)

@pradyunsg To offer constructive feedback: all the places you mention are for people actively participating in Python or pip development. I suspect that's the minority of your users. The best place for this heads-up would have been in the stdout message where pip encourages users to upgrade.

But I'd also like to echo what was said in the thread before: this is placing a burden on the end user who can't fix the package to be compliant. And I'd also argue that never breaking should not be ruled out. There is a huge body of packages with non-compliant metadata, not only in private registries. Removing the ability to install those makes pip fundamentally less useful.

@sztomi Hi - heads-up that if you depend on Python packaging you should subscribe to pypi-announce, and if you depend on Python software you should consider subscribing to Python-announce. They are relatively low-traffic lists that are meant for end users.

z3by commented

Using --use-deprecated=legacy-resolver solved it for now.

dcupp commented

Enforcing PEP 440 even for internal packages is unfortunate. Since PEP 440's scheme was designed for open source packages published on public servers, it naturally does not consider some of the use cases we rely on internally (we tried to fit them into PEP 440 but were unsuccessful). Thankfully --use-deprecated=legacy-resolver works for now.