Always remove extras in compiled files
FlorentJeannot opened this issue · 45 comments
Issue
I am opening this issue to discuss if the extras should be kept or removed in the compiled files.
@atugushev made a good summary of the situation:
Currently, we have direct references without extras and pinned packages with extras in requirements.txt, which looks wrong and should be synced in some single way.
We would like to get feedback from the community: do you think we should keep them or remove them, and why?
My opinion is that we should remove them. Since pip-compile already lists all packages needed for a project, it seems redundant to me to specify a package twice (once in the extra and once as a top-level dependency). Also, it's in theory possible to install more packages than those specified in the requirements.txt via the extras. In my opinion, the generated .txt file of pip-compile should act as a lock file. The only advantage I can see in keeping them is that we can easily inspect which dependencies are using extras.
@AndydeCleyre said that the order of installation could matter in some cases such as GDAL, which requires numpy to be installed first. I checked whether having the extra (gdal[numpy]) in the .txt file made a difference, and I found that it did not work. You can read this gist if you want to have a look at the tests I've done (there's a conclusion at the end if you don't want to read it all).
Links
Some links about the discussion around this:
- PR to remove extras
- PR to add extras in direct references
- PR about the direct reference without extras
Samples from dependency management tools
The goal is to show you the output of different management tools when the project specifies extras. This may help you make a decision on the issue.
For each tool, I installed gdal[numpy]==3.2.2.
pip
Command: pip freeze > requirements.txt
The file contains:
GDAL==3.2.2
numpy==1.22.3
pip-tools
Command: pip-compile
The file contains:
gdal[numpy]==3.2.2
    # via -r requirements.in
numpy==1.22.3
    # via gdal
Pipenv
Command: pipenv lock -r > requirements.txt
The file contains:
-i https://pypi.org/simple
gdal[numpy]==3.2.2
numpy==1.22.3
Poetry
Command: poetry export -o requirements.txt
The file contains:
gdal==3.2.2
numpy==1.22.3; python_version >= "3.8"
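To make the comparison concrete, here is a minimal Python sketch that parses the four outputs above and reports, for each tool, the pinned versions and which names kept their extras. The regular expression and the helper names (`parse_pins`, `summarize`) are assumptions made for illustration; this is not how any of these tools parse requirements, and name normalization is reduced to lowercasing.

```python
import re

# Outputs copied from the samples above, one string per tool.
SAMPLES = {
    "pip": "GDAL==3.2.2\nnumpy==1.22.3\n",
    "pip-tools": (
        "gdal[numpy]==3.2.2\n# via -r requirements.in\n"
        "numpy==1.22.3\n# via gdal\n"
    ),
    "Pipenv": "-i https://pypi.org/simple\ngdal[numpy]==3.2.2\nnumpy==1.22.3\n",
    "Poetry": 'gdal==3.2.2\nnumpy==1.22.3; python_version >= "3.8"\n',
}

# Name, optional [extras], then an exact pin; markers after ";" are ignored.
PIN = re.compile(r"^([A-Za-z0-9._-]+)(?:\[([^\]]*)\])?==([^\s;]+)")


def parse_pins(text):
    """Return {name: (extras, version)} for every pinned line in text."""
    pins = {}
    for line in text.splitlines():
        match = PIN.match(line.strip())
        if match is None:
            continue  # comments, options such as -i, blank lines
        name, extras, version = match.groups()
        extras_set = frozenset()
        if extras:
            extras_set = frozenset(e.strip() for e in extras.split(","))
        pins[name.lower()] = (extras_set, version)
    return pins


def summarize(samples):
    """Map each tool to (pinned versions, names that kept extras)."""
    summary = {}
    for tool, text in samples.items():
        pins = parse_pins(text)
        versions = {name: version for name, (_, version) in pins.items()}
        with_extras = sorted(n for n, (extras, _) in pins.items() if extras)
        summary[tool] = (versions, with_extras)
    return summary


if __name__ == "__main__":
    for tool, (versions, with_extras) in summarize(SAMPLES).items():
        print(tool, versions, "extras on:", with_extras)
```

Under these assumptions, all four tools pin the same two packages at the same versions; only the pip-tools and Pipenv outputs carry the extra on gdal.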
Pros and cons
I'll try to collect all your feedback to update these lists.
Reasons to keep the extras:
- We can clearly see which dependencies are using extras (FlorentJeannot)
- #1613 (comment)
Reasons to remove the extras:
- In theory, it's possible that pip-sync or pip could install more packages than what is listed in a .txt file because of extras. I think the output of pip-compile should act as a lock file, so it should only install what's specified in the .txt file. (FlorentJeannot)
- It's redundant. Packages specified in the extras are also in the top-level dependencies. (FlorentJeannot)
- #1613 (comment)
@FlorentJeannot You can remove extras from the output file with the --strip-extras option. This was added in version 6.2.0 to allow creating constraint-compatible requirements files. For example, I have my requirements-dev.in headed with the line -c requirements.txt, so that development requirements don't try to have dependencies that are incompatible with the compiled dependencies for production.
https://github.com/jazzband/pip-tools/releases/tag/6.2.0
This is an important use-case for me, so I wouldn't personally mind having this become the default, but the output you're looking for is already possible to obtain.
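As a rough illustration of what a constraint-compatible file looks like, the sketch below removes the `[extras]` part from requirement lines of an already compiled file while leaving comment lines untouched. It is a text-level approximation written for this thread, with an assumed helper name (`strip_extras`) and an assumed regular expression; it is not the pip-tools implementation of --strip-extras.

```python
import re

# A requirement line that starts with a name followed by [extras],
# e.g. "pyjwt[crypto]==2.8.0". The pattern is an assumption for illustration.
_EXTRAS = re.compile(r"^([A-Za-z0-9._-]+)\[[^\]]*\]")


def strip_extras(compiled_text):
    """Drop the [extras] part from requirement lines, keeping comments as-is."""
    lines = []
    for line in compiled_text.splitlines(keepends=True):
        if line.lstrip().startswith("#"):
            lines.append(line)  # annotations such as "# via ..." stay untouched
        else:
            lines.append(_EXTRAS.sub(r"\1", line))
    return "".join(lines)


if __name__ == "__main__":
    compiled = (
        "gdal[numpy]==3.2.2\n"
        "    # via -r requirements.in\n"
        "numpy==1.22.3\n"
        "    # via gdal\n"
    )
    print(strip_extras(compiled), end="")
```

The result keeps every pin and annotation, so it can serve as the target of a `-c` line in a layered setup like the one described above.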
- I think it is possible for a package to change its installation behavior based on whether an optional dependency, implied via an extra-group, is already installed. e.g. coolproj[alternate-file-layout]
- @LouisAumaitre might want to comment here; the current inclusion of extras makes it invalid as a constraints file, when using the backtracking resolver
I'm going to copy my comment from #1539 here, which sums up my current thoughts on this:
As long as we are offering --strip-extras and not offering its negative, I'd guess that the default output line format would include extras (where this PR currently strips them).
Now that the constraints syntax is stricter (with the backtracking resolver), I expect it will be much more common for folks to need files without the extras. So I would support a separate PR to do that by default, while offering a new option to include them, e.g. --no-strip-extras/--include-extras.
I think it is possible for a package to change its installation behavior based on whether an optional dependency, implied via an extra-group, is already installed. e.g. coolproj[alternate-file-layout]
Oooh, this one is rough. If the order of installation of packages matters, that is a real challenge. One that I'd rather nobody ever have to think about. I don't think it'd ever be possible to make that behavior intuitive.
@ryanhiebert I agree.
I tried to install GDAL here, which depends on numpy. It was really painful to get it working with pip, and I don't think there is an easy way to reproduce that with pip-tools.
@FlorentJeannot from this link from your GDAL gist discussion, I'd say it might be best for us to ignore the ordering thing. It's not intended behavior of setuptools, so we wouldn't want to encourage that type of bad behavior from packages.
What is your motivation behind wanting to create a flag that preserves the current behavior? I'd be fine just changing the behavior entirely, but my perspective may not be seeing an important constraint.
@ryanhiebert I just wanted to emphasize your last message with this example (that it's a real challenge).
I first suggested removing the extras in the "compiled" files because I didn't see the point of having them, and since pip freeze does not emit them either, I figured they were simply not needed.
Then @AndydeCleyre told me that there is an order of installation when we declare an extra to a package. GDAL was mentioned in another thread about this installation order, so I wanted to try it out by myself to see what happens when we try to install GDAL with different package management tools.
Now that I've tried it, my opinion is that packages that depend on an installation order are tricky to handle (and they also seem to be rare). The way @AndydeCleyre made it work is not trivial, and it's not working for me with pip>=22 because the installation order with extras has changed in this version.
So I still think we should not have extras by default in the "compiled" files. We could have an --include-extras option, but why would we need that? The extras in the .in files seem enough in my opinion.
gdal turned out to be a false example here, because they are trying to control build time behavior based on the installed package set, whereas the extras only guarantee installation order, not whether extra-specified deps are installed at build time.
Agreed with both of you, @FlorentJeannot and @AndydeCleyre. So far as I can see, it would be fine to make removing extras the only behavior, and to deprecate the --strip-extras flag entirely.
@AndydeCleyre , my question about motivation was intended for you (though I wasn't keeping good track of who I was responding to). Is there some important constraint, other than install order (which we've shown is about a concern that setuptools says should not be considered), that suggests that we should keep the ability to include extras somehow that I'm not seeing?
And you answered that question on the PR linked earlier. I'm also fine with keeping some flag and just changing the default behavior.
@FlorentJeannot thanks a lot for this awesome analysis and detailed summary!
I'm in favor of stripping extras in requirements.txt:
- once requirements.in is compiled there is (should be) no difference in installation result whether the requirements.txt was with or without extras
- requirements.txt without extras can be used as a constraint file in the layered workflow. Currently, users have to run pip-compile --strip-extras
- fewer bytes and less distracting info in requirements.txt
- requirements.txt should look more like pip freeze, where there are no extras
@atugushev I agree we should start stripping extras by default, but
... once requirements.in is compiled there is (should be) no difference in installation result whether the requirements.txt was with or without extras
What about my comment here?
@AndydeCleyre the link does not show the comment. Could you quote here?
I'm not saying it's good or common practice, but my understanding is that extras can be used to enforce installation order, and the set of packages already installed can be used by setup.py's install to follow different code paths accordingly.
A hypothetical example:
- we have a package, mypackage
- it defines an extra, interactive
- the same package author also provides a kind of dummy package, mypackage-interactive-installation, required by mypackage's interactive extra
- during mypackage's install, if and only if mypackage-interactive-installation is already installed, the user is prompted to make some choices which will affect the installation
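The hypothetical scenario above can be sketched in Python. Everything here is invented for illustration: the module name `mypackage_interactive_installation` and both helper functions are assumptions, and the sketch only shows the general idea of branching on what is already installed, not any real package's setup.py.

```python
import importlib.util


def helper_is_installed(module_name="mypackage_interactive_installation"):
    """True if the (hypothetical) helper module is importable right now."""
    return importlib.util.find_spec(module_name) is not None


def choose_install_mode(helper_present):
    """Pick a code path based on what was installed before this package."""
    return "interactive" if helper_present else "default"


if __name__ == "__main__":
    # The outcome depends on installation order: the same package can behave
    # differently depending on whether the helper was installed first.
    print(choose_install_mode(helper_is_installed()))
```

The point of the sketch is that the result depends on environment state at install time, which is exactly what a compiled requirements file cannot capture.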
@AndydeCleyre thanks! That looks like shooting yourself in the foot. While I understand there are (historically) setup.py hackers in the wild, we would never satisfy their needs due to the dynamic nature of setup.py builds. As far as I can see, the Python world is slowly moving towards static metadata (hello setup.cfg/pyproject.toml), and I don't see any reason why we shouldn't encourage that.
I agree with @atugushev
It sounds to me like we have a rough consensus that changing the default to strip extras is likely appropriate. I think what it needs now is someone to take a stab at making it happen with a pull request. Whether the existing behavior remains is up to the implementer and those who review the pull request. It is possible that having the backward compatibility option (which I'd prefer calling --include-extras, if present) could make it easier to agree to change the default when the pull request is created, but it's also true that I can't think of a use-case where I'd ever want to use it rather than try to fix what I'd likely consider a broken library.
@ryanhiebert PR already exists (#1608) but it does not include the --include-extras that you suggest, for now.
Oh nice, thank you for letting me see that. How do we draw out some consensus of action at this point?
Proposal 1:
- Release 1:
  - Strip extras by default
  - Deprecate --strip-extras
  - Add flag for including extras
- Release 2:
  - Remove --strip-extras
Proposal 2:
- Release 1:
  - Strip extras always
  - Deprecate --strip-extras
- Release 2:
  - Remove --strip-extras
Also keep in mind that whether we want them to or not, there are definitely teams out there using their own parsers on the output of pip-compile, for their particular build processes.
I'll wait for a comment from @atugushev before making any change in my PR.
I'm still in favor of "always stripping extras" and vote for 2nd proposal. My motivation:
- pip-tools resolves all dependencies, so keeping extras changes nothing (except in rare cases)
- consistency with pip freeze
- compatibility with constraints files (pip install -c constraints.txt)
- less code
The only difference between options 1 and 2 suggested by @AndydeCleyre is whether to add --include-extras. It seems to me that there are use cases where it would be useful to include the extras, and in those cases completely removing the ability to keep extras would force using alternative solutions.
Unless the maintenance burden is unusually high, I don't see why you'd want to remove this existing feature.
FWIW, if pip-compile did always remove extras from the compiled requirements, then I wouldn't have ended up filing pip/#11599.
But I think preserving the extras in the compiled requirements is useful for understanding how the dependency tree was calculated, as I noted over in the pip issue.
Why pip-tools should strip extras from pip devs - pypa/pip#11599 (comment).
Since pip-compile --resolver backtracking always strips extras and will be the default resolver in 7.0.0, we can make the legacy resolver consistent with backtracking in this version and deprecate --strip-extras (2nd proposal). Is that alright?
Or leave legacy resolver as is and just deprecate the option.
Using --resolver=backtracking does not currently strip extras. Again: I filed that bug against pip after switching to --resolver=backtracking, per the recommendation in the newly-added warning in pip-compile.
Why pip-tools should strip extras from pip devs - pypa/pip#11599 (comment).
AFAIU, that comment is specifically giving the rationale for excluding extras from valid constraints files, which is not the only type of file generated by pip-tools.
AFAIU, that comment is specifically giving the rationale for excluding extras from valid constraints files
That's correct. I'd like to mention that under the hood pip-tools passes requirements.txt (if it exists) as -c requirements.txt to pip's resolver:
pip-tools/piptools/resolver.py
Lines 525 to 526 in 6870602
and strips extras:
pip-tools/piptools/resolver.py
Line 538 in 6870602
which is not the only type of file generated by pip-tools.
So essentially requirements.txt is a constraint file, and the pip-tools resolver already follows the comment pypa/pip#11599 (comment), except that it inconsistently injects extras back into requirements.txt.
I am going to rename this ticket to "Removal of strip-extras option as being implicit", so we can start implementing it. If I read correctly, most people agreed on going in that direction, especially as keeping them is a serious maintenance burden and a can of worms.
Out of curiosity, has this been earmarked for a future pip-tools release? 7.0? Is it blocked by #1755?
I once again find myself wanting this to be the default. I think there's consensus on it, from what I can tell, so we may only be waiting on a pull request to manage the --strip-extras and --include-extras options and give a deprecation notice when neither is given explicitly, so that whenever we're ready to do another major release it will be ready to switch the default.
I don't think that it is blocked on anything, @acompa , so this really may be just a motivated pull request away. Are you the person to do it? Maybe I will be? We'll see who among those who care about this gets around to it first. My expectation is that it would not be a significant undertaking.
Given that the backtracking resolver always strips extras, and it's been the default since version 7.0.0, it makes sense to deprecate the --strip-extras option and update the code accordingly to remove extras from requirements.txt in the next minor release.
Is the plan to allow folks to include extras if they want to, or remove that possible workflow entirely?
I suggest that we remove the workflow entirely, as was suggested and voted for in the second proposal.
Given that the backtracking resolver always strips extras, and it's been the default since version
@atugushev is this right? I'm on version 7, and I ended up back here because I got an extra in my requirements file.
Just checked, and I don't believe that strip extras is the default, even on >7. Notice the pyjwt[crypto] dependency, which has the extra.
$ pip-compile --version
pip-compile, version 7.2.0
$ echo django-allauth > sample.in
$ pip-compile sample.in
#
# This file is autogenerated by pip-compile with Python 3.10
# by the following command:
#
#    pip-compile sample.in
#
asgiref==3.7.2
    # via django
certifi==2023.7.22
    # via requests
cffi==1.15.1
    # via cryptography
charset-normalizer==3.2.0
    # via requests
cryptography==41.0.3
    # via pyjwt
defusedxml==0.7.1
    # via python3-openid
django==4.2.4
    # via django-allauth
django-allauth==0.54.0
    # via -r sample.in
idna==3.4
    # via requests
oauthlib==3.2.2
    # via requests-oauthlib
pycparser==2.21
    # via cffi
pyjwt[crypto]==2.8.0
    # via django-allauth
python3-openid==3.2.0
    # via django-allauth
requests==2.31.0
    # via
    #   django-allauth
    #   requests-oauthlib
requests-oauthlib==1.3.1
    # via django-allauth
sqlparse==0.4.4
    # via django
typing-extensions==4.7.1
    # via asgiref
urllib3==2.0.4
    # via requests
We've seen a couple people chime in since that original 2-option proposal that they'd like to be able to keep the extras in the output. I think we're settled that strip-extras should be the default, but unless there are implementation reasons that make it difficult to keep both, I think it wisest that we not pull the rug out from under other users that may wish to have the extras in some cases.
That said, rather than having extras in the specified dependencies, I think they would fit better in the compiled requirements file if they were included in the via comment lines. We don't currently do that, but it would meet some of the expressed desire to keep extras, and I don't anticipate anyone objecting to such a change. If we got that, it might lessen objections to removing the ability to have the extras in the pip-significant output entirely.
is this right? I'm on version 7, and I ended up back here because I got an extra in my requirements file
I made a mistake, that's my bad. Thanks for providing the reproducer. Much appreciated.
The original issue was that pip-compile includes extras for regular dependencies but doesn't do so for direct references (aka <name> @ <url>) in the output file. Then pip released an update that reworked the constraints file format by forbidding extras there. In response, pip-tools added the --strip-extras option to address this.
We've seen a couple people chime in since that original 2-option proposal that they'd like to be able to keep the extras in the output. I think we're settled that strip-extras should be the default, but unless there are implementation reasons that it makes it difficult to keep both, I think it wisest that we not pull the rug out from under other users that may wish to have the extras in some cases.
Making strip-extras=true the default makes sense because, in a layered workflow, we pass requirements as constraint files (via -c <file>) and pip forbids extras there. So, if we want a smooth transition, we should:
- Introduce --no-strip-extras to satisfy those who desire extras in the output file.
- Add extras to direct reference dependencies (merge this: #1582) to sync the behaviour.
- Show a warning that pip-tools will switch the default to strip-extras=true if users haven't explicitly passed the --strip-extras or --no-strip-extras options. This way, users will be aware of and can adapt to future changes.
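The warning step in the list above could look roughly like the following Python sketch. The function name `resolve_strip_extras`, the warning text, and the choice of `FutureWarning` are assumptions for illustration only; the sketch shows the tri-state idea (flag given as true, given as false, or not given at all), not the code that pip-tools ships.

```python
import warnings


def resolve_strip_extras(strip_extras=None):
    """Return the effective setting; warn when the user passed neither flag.

    strip_extras is True for --strip-extras, False for --no-strip-extras,
    and None when neither option was given on the command line.
    """
    if strip_extras is None:
        warnings.warn(
            "Neither --strip-extras nor --no-strip-extras was passed; "
            "the default will change to --strip-extras in a future release.",
            FutureWarning,
            stacklevel=2,
        )
        return False  # current default in this sketch: extras are kept
    return strip_extras


if __name__ == "__main__":
    print(resolve_strip_extras(True))   # explicit choice, no warning
    print(resolve_strip_extras())       # emits the FutureWarning
```

Keeping "not given" distinct from both explicit values is what lets the warning reach only users who rely on the default.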
Feel free to submit a PR. You'll have my support with the review.
That said, rather than having extras in the specified dependencies, I think they would fit better in the compiled requirements file if they were included in the via comment lines. We don't currently do that, but it would meet some of the expressed desire to keep extras, and I don't anticipate anyone objecting to such a change. If we got that, it might lessen objections to removing the ability to have the extras in the pip-significant output entirely.
Yes, pip-compile has never supported extras in via annotations, and this seems like a separate enhancement issue. If someone wants to jump in and add support, please feel free to open a separate issue and submit a PR. I'd be glad to provide support with the review.
Again, my apologies for making incorrect assumptions. I'm going to close my PR #1953 since it's no longer relevant.
Yes, pip-compile has never supported extras in via annotations, and this seems like a separate enhancement issue. If someone wants to jump in and add support, please feel free to open a separate issue
Is this #1577?
Great, it already exists! Thanks @AndydeCleyre
I haven't seen this elsewhere, so I'll ask here:
How does removing extras from compiled files interact with --generate-hashes and then using --require-hashes in pip install?
Works great in my experience. It's important to also use pip install --no-deps, but that isn't unique to these flags.
That said, rather than having extras in the specified dependencies, I think they would better fit in the compiled requirements file that they were included in the via comment lines.
I'd like to second this, as it will keep the semantic reasoning behind the extras at least somewhat intact in the generated files.