Adopting "working" scheme for every run
pradyunsg opened this issue · 28 comments
Just carrying over my idea in #1056 (comment), for proper dedicated discussion.
AFAIK, there are 3 possible schemes for packages:
- "system" - for system/global packages (--system)
- "user" - for packages installed in user space (--user)
- "local" - for virtualenv packages (--local)
pip enforces a "working" scheme on every run. Outside a virtualenv, the default working scheme would be "system" (it should really be "user", but that's another issue, #1668). Inside a virtualenv, the default working scheme should be "local". Passing --system, --user or --local overrides the working scheme.
Only packages in the same scheme as the working scheme can be modified. By modifying, I mean installing or uninstalling a package. Trying to modify a package in a different scheme is not allowed and pip would print a message and error out.
So, modifying a package in the system scheme with a "user" working scheme is not allowed. Nor is modifying a package in the user scheme with a "system" working scheme. Neither are the other permutations with "local".
I think this results in a pretty simple behaviour model.
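A minimal Python sketch of the model described above. The names Scheme, working_scheme and check_can_modify are hypothetical illustrations and are not part of pip:

```python
from enum import Enum


class Scheme(Enum):
    SYSTEM = "system"
    USER = "user"
    LOCAL = "local"


def working_scheme(in_virtualenv, override=None):
    """Return the explicit override if given, else the context default."""
    if override is not None:
        return override
    # Proposed defaults: "local" inside a virtualenv, "system" outside.
    return Scheme.LOCAL if in_virtualenv else Scheme.SYSTEM


def check_can_modify(package_scheme, active_scheme):
    """Raise if the package lives in a different scheme than the working one."""
    if package_scheme is not active_scheme:
        raise PermissionError(
            f"package is in the {package_scheme.value!r} scheme, "
            f"but the working scheme is {active_scheme.value!r}"
        )
```

Under this sketch, installing or uninstalling would call check_can_modify first and error out on a mismatch.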
As @ncoghlan pointed out, this would need some logic to understand that user installs shadow system ones. Also, I would slightly change this to spell it like --scheme {global,user,venv} because I like how this signifies exclusiveness of the behaviours better than plain flags do and is consistent with some internal configuration stuff.
Additionally, while doing this:
I would slightly change this to spell it like --scheme {global,user,venv}
Oh, and this would let you set a default scheme for yourself using the configuration toolchain.
In principle this sounds like a reasonable thing to do - but I'm afraid I've got some personal priorities at the moment that mean I don't really have time to think through the implications.
So count me as in favour in principle, but not really able to provide a detailed review at the moment, sorry.
Cool. Thanks! :)
Just noting that I briefly thought that --scheme may not be a good name for this option, due to the potential collision with the concept of installation schemes in sysconfig: https://docs.python.org/3/library/sysconfig.html#installation-paths
However, I subsequently realised that these uses are actually the same use case - the new pip level option is just a helper to select the desired scheme without having to specify the exact platform appropriate scheme name as defined in sysconfig.
Linking to #2418 since I somehow always forget there was an attempt at making --user behaviour default.
concept of installation schemes in sysconfig
This is, actually, nice that the names here and there match. :)
This came up in #4809 where I suggested being able to pip list only packages related to one scope (now it combines both system and user packages, making it impossible to say which package comes from where).
I've gone ahead and made a PR for this -- #4871.
Thoughts @dstufft, @xavfernandez?
Maybe Barry Warsaw (his name's on the Debian patch to pip; #1668 would be fixed as a part of that PR) should be pinged for this discussion?
How does the --target mode of operation relate to schemes? It feels like --target is essentially another type of scheme, and treating it as such would potentially solve many bugs pertaining to the --target option.
Like @piotr-dobrogost, I also think --target needs to be a scheme on its own. This would solve my problem #5686 where packages installed via pip install --target are not manageable at the moment.
I think that's a valid request.
However, I'd say we "promote" it to a scheme after the initial refactoring/functionality change needed to do this for existing install locations that are schemes.
Oh, and this would let you set a default scheme for yourself using the configuration toolchain.
Note that we'd have to be careful here, as the current default (if you don't specify anything) is "if you're in a virtualenv, use local, otherwise use system". But if a user wants to set a default of "if you're in a virtualenv, use local, otherwise use user" (which is likely the most common need), the config system won't help - it'll let them say "always use user", but not make that conditional on whether they are in a virtualenv or not.
It's arguable that users can put user in their config file, and explicitly use --scheme=local on the command line when they want to install to a virtualenv, but I don't think that addresses the use case of people who simply want to say "default to user".
One possibility (given that local only makes sense if you have a virtualenv active) is to allow the scheme to be "local,user", "local,system", "local", "user", or "system" (where the cases with two entries mean "use local if you're in a virtualenv, otherwise fall back to the other option", and a bare "local" means "use local if you're in a virtualenv, otherwise error"). It feels a bit clunky to me, so I'm open to better spellings, but I prefer this idea to any solution that makes specifying something in a config file work differently than specifying it in an environment variable or on the command line.
I haven't been closely tracking the evolution of the working schemes design, but what if the user and system schemes were defined as being aliases for the local scheme when an active virtualenv is detected, and there were separate force-user and force-system schemes to say "use the named scheme even when an active virtualenv is detected"?
Updated to add: Functionally, this is the same thing @pfmoore suggested, but the spellings are different:
- local,user -> just user
- local,system -> just system
- local -> venv-local (error if no active venv detected)
- user -> force-user
- system -> force-system
(I initially had some comments here about those names being semantic changes, but then I remembered that this option has never actually been released yet)
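The alias-based spelling can be sketched the same way. Again, resolve_alias is a hypothetical name used only for illustration:

```python
def resolve_alias(name, in_virtualenv):
    """Resolve the alias-style scheme names to a concrete scheme."""
    if name in ("user", "system"):
        # Aliases for "local" whenever a virtualenv is active.
        return "local" if in_virtualenv else name
    if name == "venv-local":
        if not in_virtualenv:
            raise RuntimeError("venv-local requires an active virtualenv")
        return "local"
    if name in ("force-user", "force-system"):
        # Use the named scheme even inside a virtualenv.
        return name[len("force-"):]
    raise ValueError(f"unknown scheme name: {name!r}")
```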
Yeah, that's a reasonable option too. Ultimately, it'll be about what people feel is the most "natural" formulation (which should have the shortest name) and I'm not really qualified to comment on that as I pretty much never use anything other than "local" myself.
One thing we do want to consider in terms of semantic changes is how the current (default and --user) behaviour ends up being spelled under these proposals (mine: "local,system" and "user" respectively, Nick's: "system" and "force-user").
It's also worth remembering that the original statement on this issue simply named the 3 schemes: system, local and user. It proposed a way of forcing any one of them, but offered no way of naming the default behaviour (which is context dependent, based on whether you're in a venv). By moving to having names for various combinations, we're going beyond that original scope. The reason for this is that people have expressed a strong desire for the combination "local if in a venv, else user". But that's the only case that has been explicitly requested, so we should be careful not to over-generalise too heavily here.
Personally, I think my approach has the advantage of being an "obvious" generalisation (add a fallback if the choice of local isn't valid) while still supporting the "local,user" use case. But Nick's (with the exception that I prefer "local" over the more verbose "venv-local") has names that are simpler to understand. The mathematician in me prefers my proposal, the end user in me prefers Nick's.
@pradyunsg suggested using global instead of system, which makes sense. Actually I would go even further and suggest interpreter, as the scope of installation in this case is interpreter-wide and there can be many interpreters installed in the system, whereas global suggests something unique in the scope of the system.
but what if the user and system schemes were defined as being aliases for the local scheme when an active virtualenv is detected, and there were separate force-user and force-system schemes to say "use the named scheme even when an active virtualenv is detected"?
The idea of using the local scheme by default (and discarding options choosing any other scheme) when working in the context of a virtualenv seems right. As to working in a scheme other than local in the context of a virtualenv: is such an option really needed? What's the use case? What if the system/global/interpreter and user schemes were allowed only outside of a virtualenv?
@pradyunsg suggested using global instead of system which makes sense. Actually I would go even further and would suggest interpreter as the scope of installation in this case is interpreter-wide and there can be many interpreters installed
I think 'interpreter' is liable to confusion, because environments typically have a Python interpreter inside them (and we recommend using path/to/env/bin/python -m pip to explicitly install into an environment). So I'd take 'interpreter' to mean the environment, not a global installation.
This also relates to another question: I don't know a good general way to distinguish an 'environment' from a systemwide installation; they are both based on an installation prefix with a standard organisation of folders under that. For flit install, I don't try to conceptually separate env vs global, but I pick whether to do a --user install based on whether the library directory is writable. This means that a standard non-sudo install won't try to act systemwide, but a sudo install defaults to systemwide rather than a user install for root (which is probably not what you want).
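A sketch of that writability heuristic, not flit's actual code. The function name should_use_user_install is hypothetical, while sysconfig.get_path and os.access are standard library calls:

```python
import os
import sysconfig


def should_use_user_install(lib_dir=None):
    """Return True when the library directory is not writable by this process."""
    if lib_dir is None:
        # Default to the running interpreter's site-packages directory.
        lib_dir = sysconfig.get_path("purelib")
    return not os.access(lib_dir, os.W_OK)
```

Note that this check makes no attempt to tell an environment apart from a systemwide installation; it only asks whether the target can be written to.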
I just had a chance to read through this after being pointed here. How about having in addition to the original scheme=global,user,local a second config option local_site=global,user? The latter does not necessarily need exposing to command-line as it's more of a system configuration option. It would only affect behaviour when local is used, otherwise it would be ignored.
I am really happy to see that someone is working on dealing with this issue, which has been haunting me since the dawn of time.
As a note, when implementing it please allow configuration of a sorted preference list so I can configure pip to install: in virtualenv if any, fallback to user, fallback to system (or other order).
Ideally I would like to see a behavior that works like this:
- install package in current virtualenv, if any
- try to install on system if you have permissions
- try to install on user
- fail if any attempt failed
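One way to read the ordered preference list above, as a Python sketch. It selects a scheme by availability instead of actually attempting installs, and pick_scheme and its parameters are hypothetical names:

```python
def pick_scheme(preferences, in_virtualenv, system_writable):
    """Return the first usable scheme from an ordered preference list."""
    for scheme in preferences:
        if scheme == "local" and in_virtualenv:
            return "local"
        if scheme == "system" and system_writable:
            return "system"
        if scheme == "user":
            # Assumed to be always available in this sketch.
            return "user"
    raise RuntimeError("no usable install scheme in preference list")
```

For example, pick_scheme(["local", "system", "user"], in_virtualenv=False, system_writable=False) would select "user".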
Some distributions may set up pip in such a way that it avoids overriding packages installed using their distro package manager (yum/dnf/apt-get/...). If I am not wrong, Debian or Ubuntu did something cool where the destination of distro-installed packages does not match the pip one and Python looks in both, avoiding the risk of install/uninstall conflicts.
One thing to keep in mind: please consider default behaviour very well because pip commands are often saved in scripts and files that the user may not be able to modify to make it work. We want to succeed without having to alter the codebase, if possible.
I personally think that fallback logic is too complicated. If you're operating under a badly configured distro, use virtualenv.
@nanonyme The fallback has nothing to do with what you call a "badly" configured distro. Here is a very simple use case: you have a bash script that calls "pip install foo", which is needed for testing your code. You want to make this script usable regardless of whether the user is inside a virtualenv or not.
The script could be called by tox, so it would be inside a virtualenv; it may be called by a user outside a virtualenv, or by a user after they activated a virtualenv.
All these use cases are not only valid but also widespread. At this moment it requires adding a lot of extra logic inside that bash script in order to detect virtualenv presence and decide which params to give to pip.
I gave the bash script just as an example, but in practice users may not even have a script and only a configuration line that accepts a "command to run", something quite common on CI systems (see travis.ini). Those CI systems may run that code inside or outside a virtualenv.
If pip is not able to detect and make use of a virtualenv, it makes any usage of it much harder because the user would be forced to write wrappers around that code in order to make it work in various contexts. Writing these wrappers is not even possible in some cases (or just hard and ugly due to the need to cope with multiple levels of quoting in order to use bash to implement that missing logic).
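To illustrate the kind of wrapper logic being described, here is a Python sketch (the same idea applies to a bash script). The helper names are hypothetical. The detection relies on sys.prefix differing from sys.base_prefix inside a venv, with sys.real_prefix as the marker used by older virtualenv releases:

```python
import sys


def in_virtualenv():
    """Best-effort check for an active virtual environment."""
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


def pip_install_command(requirement, inside_env=None):
    """Build a pip command line, adding --user only outside a virtualenv."""
    if inside_env is None:
        inside_env = in_virtualenv()
    cmd = [sys.executable, "-m", "pip", "install"]
    if not inside_env:
        cmd.append("--user")
    cmd.append(requirement)
    return cmd
```

The returned list could be passed to subprocess.run. The point of the comment above is that callers currently have to carry this logic themselves.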
Basically with above you have
- Explicit global, ignore virtualenv
- Explicit user, ignore virtualenv
- Explicit local, local set to global, virtualenv with fallback to global
- Explicit local, local set to user, virtualenv with fallback to user
The fallback sequences are far simpler this way
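The four cases above could be expressed roughly as follows. This is only a sketch of the proposed scheme and local_site options, and effective_scheme is a hypothetical name:

```python
def effective_scheme(scheme, local_site, in_virtualenv):
    """Combine the proposed 'scheme' and 'local_site' options."""
    if scheme not in ("global", "user", "local"):
        raise ValueError(f"unknown scheme: {scheme!r}")
    if local_site not in ("global", "user"):
        raise ValueError(f"unknown local_site: {local_site!r}")
    if scheme != "local":
        # Explicit global or user: ignore any virtualenv.
        return scheme
    # local: use the virtualenv if there is one, else fall back to local_site.
    return "virtualenv" if in_virtualenv else local_site
```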
Not sure if I missed it, how would this interact with the set of packages available for import in a setup.py-based install, or a PEP 517 install using --no-build-isolation?
Also, is the assumption that for determining whether a dependency is already installed we'd have some of these schemes "inherit" the packages that would be available in another? Like:
- user considers packages in system
- local considers packages in user and system only if system site packages is enabled
- system doesn't consider any others
- target doesn't consider any others (or it may be configurable - I think we've seen use cases going both ways)
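That "inheritance" question can be written down as a small lookup. This sketch only encodes the list above, and visible_schemes is a hypothetical name:

```python
def visible_schemes(scheme, system_site_packages=False):
    """Return the schemes whose packages count as already installed."""
    if scheme == "user":
        return ["user", "system"]
    if scheme == "local":
        if system_site_packages:
            return ["local", "user", "system"]
        return ["local"]
    if scheme in ("system", "target"):
        # Neither considers packages from any other scheme.
        return [scheme]
    raise ValueError(f"unknown scheme: {scheme!r}")
```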
No changes on that front. This would only affect how a package is installed / unpacked, not how it's built.
Things that are currently importable at build time, would stay importable -- basically anything that's on sys.path when running.
It looks like @takluyver in #7164 (comment) has a nice plan for stopping some of the damage of defaulting to write to system wide packages. #7002 That should hopefully stop much of the damage experienced by newbies. Thank you, thank you, thanksssss to all involved!
(Re?)iterating that this change is still relevant in a post #7002 world, since this is an internal refactoring-related issue in pip.
I'd expect us to start by making refactors to decouple the scheme from the various parts of our codebase (like @chrahunt has started!) and once that's completely/fairly done; we'd start working on the user-facing changes.
I agree this will hopefully improve the current inconsistent behavior. For example, using a Python 3.11.3 venv with pip 22.3.1 on Linux, I am able to install a package with the --user option, but if I attempt to uninstall the same package using the same venv/pip, it refuses, indicating the package is "outside environment" and "Can't uninstall 'mypackage'. No files were found to uninstall." I would expect the uninstall to work given the install was permitted.
If you have a sudo account, use this command.
sudo dnf remove python3-requests
For example, Amazon Linux 2023 has this issue.