pypa/pip

Upgrading pip fails on Windows when install path is too long

zooba opened this issue · 39 comments

zooba commented

Received the following log on microsoft/PTVS#782.

Virtual environment is being created at 'C:\Users\TrevorSullivan\Source\Repos\PythonApplication11\PythonApplication11\env'
Virtual environment was successfully created at 'C:\Users\TrevorSullivan\Source\Repos\PythonApplication11\PythonApplication11\env'
----- Installing 'pip' -----
You are using pip version 6.0.8, however version 7.1.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Collecting pip from https://pypi.python.org/packages/py2.py3/p/pip/pip-7.1.2-py2.py3-none-any.whl#md5=5ff9fec0be479e4e36df467556deed4d
  Using cached pip-7.1.2-py2.py3-none-any.whl
Installing collected packages: pip
  Found existing installation: pip 6.0.8
    Uninstalling pip-6.0.8:
      Exception:
      Traceback (most recent call last):
        File "C:\Python34\lib\shutil.py", line 523, in move
          os.rename(src, real_dst)
      FileNotFoundError: [WinError 3] The system cannot find the path specified: 'c:\\users\\trevorsullivan\\source\\repos\\pythonapplication11\\pythonapplication11\\env\\lib\\site-packages\\pip\\_vendor\\requests\\packages\\urllib3\\packages\\ssl_match_hostname\\__pycache__\\_implementation.cpython-34.pyc' -> 'C:\\Users\\TREVOR~1\\AppData\\Local\\Temp\\pip-h64zdfhc-uninstall\\users\\trevorsullivan\\source\\repos\\pythonapplication11\\pythonapplication11\\env\\lib\\site-packages\\pip\\_vendor\\requests\\packages\\urllib3\\packages\\ssl_match_hostname\\__pycache__\\_implementation.cpython-34.pyc'

      During handling of the above exception, another exception occurred:

      Traceback (most recent call last):
        File "C:\Users\TrevorSullivan\Source\Repos\PythonApplication11\PythonApplication11\env\lib\site-packages\pip\basecommand.py", line 232, in main
          status = self.run(options, args)
        File "C:\Users\TrevorSullivan\Source\Repos\PythonApplication11\PythonApplication11\env\lib\site-packages\pip\commands\install.py", line 347, in run
          root=options.root_path,
        File "C:\Users\TrevorSullivan\Source\Repos\PythonApplication11\PythonApplication11\env\lib\site-packages\pip\req\req_set.py", line 543, in install
          requirement.uninstall(auto_confirm=True)
        File "C:\Users\TrevorSullivan\Source\Repos\PythonApplication11\PythonApplication11\env\lib\site-packages\pip\req\req_install.py", line 667, in uninstall
          paths_to_remove.remove(auto_confirm)
        File "C:\Users\TrevorSullivan\Source\Repos\PythonApplication11\PythonApplication11\env\lib\site-packages\pip\req\req_uninstall.py", line 126, in remove
          renames(path, new_path)
        File "C:\Users\TrevorSullivan\Source\Repos\PythonApplication11\PythonApplication11\env\lib\site-packages\pip\utils\__init__.py", line 316, in renames
          shutil.move(old, new)
        File "C:\Python34\lib\shutil.py", line 535, in move
          copy2(src, real_dst)
        File "C:\Python34\lib\shutil.py", line 245, in copy2
          copyfile(src, dst, follow_symlinks=follow_symlinks)
        File "C:\Python34\lib\shutil.py", line 109, in copyfile
          with open(dst, 'wb') as fdst:
      FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\TREVOR~1\\AppData\\Local\\Temp\\pip-h64zdfhc-uninstall\\users\\trevorsullivan\\source\\repos\\pythonapplication11\\pythonapplication11\\env\\lib\\site-packages\\pip\\_vendor\\requests\\packages\\urllib3\\packages\\ssl_match_hostname\\__pycache__\\_implementation.cpython-34.pyc'
----- Failed to install 'pip' -----

The problem seems to be that the entire install path is replicated beneath TEMP, which very quickly exceeds the maximum path length supported by Windows (MAX_PATH, 260 characters). I guess the aim is to be able to roll back a failed uninstall (which also fails here, and leaves corrupt state), but we may need an alternative to including the full path - maybe generate some sort of map file as well?
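
A minimal sketch of the length arithmetic, using the two paths from the log above. The helper name `stash_path` is hypothetical and only mirrors what the traceback shows (the drive-less install path nested under the temporary directory); it is not pip's actual code. `ntpath` is used so the example behaves the same on any OS.

```python
import ntpath  # Windows path rules, importable on any platform

MAX_PATH = 260  # classic Win32 limit, which includes the terminating NUL


def stash_path(temp_dir, installed_file):
    """Hypothetical helper: nest the drive-less install path under temp_dir."""
    _drive, tail = ntpath.splitdrive(installed_file)
    return ntpath.join(temp_dir, tail.lstrip("\\/"))


installed = (
    r"c:\users\trevorsullivan\source\repos\pythonapplication11"
    r"\pythonapplication11\env\lib\site-packages\pip\_vendor\requests"
    r"\packages\urllib3\packages\ssl_match_hostname\__pycache__"
    r"\_implementation.cpython-34.pyc"
)
temp_dir = r"C:\Users\TREVOR~1\AppData\Local\Temp\pip-h64zdfhc-uninstall"

stashed = stash_path(temp_dir, installed)
# The installed file fits under the limit; its stashed copy does not.
print(len(installed) < MAX_PATH, len(stashed) >= MAX_PATH)
```

The installed file is comfortably inside the limit on its own; it is only the extra temporary-directory prefix that pushes the stashed copy over.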

Related to #2892

I am still experiencing the same issue even though I am using pip 7.1.2 and Python 3.5.1.

Are there any updates available regarding this issue?

Just ran into this under pip 8.1.1 and Python 3.5.2 on Windows 10.

I wonder whether a workaround for Windows users might be to check for paths that are likely to exceed the limit (260 characters in total) and use subst to mount a drive at a suitable point, i.e. identify a spare drive letter and map it to TEMP. Of course this would not truly fix the problem, and deeply nested packages could still run into it; that is down to Microsoft actually addressing the MAX_PATH issue once and for all. If the installation succeeds, the temporary area and the subst mapping could both be deleted. If it fails, the path on the "new" drive could be used for recovery before the temporary area and the subst mapping are deleted. Of course, this cannot help someone who already has 23 drive letters mapped (A:, B: and C: are reserved)!

In the case of a virtual environment facing the issue I think that the best that can be done is to issue the user an error with a suggestion that they make the virtual environment with --relocatable and either move it higher up the directory tree or use subst to put it at the top of a "new" drive.

I have just done a quick test on a Windows 10 (Anniversary Update) 64-bit machine, and it quite happily let me create a directory path 217 characters long (at which point the next mkdir failed), then mount it with 'subst' to a new drive letter and create a path 117 characters long on that drive, which was all I had time for.

zooba commented

FWIW, Windows 10 allows the option to disable MAX_PATH limits for "self-certified" applications, which will include Python 3.6. Currently there's a machine policy that needs to be enabled, and the Python 3.6 installer will prompt users to do that (if they can - need to be admin).

So the issue will go away in the future, and anything else we do now is mitigation for people on existing setups (which obviously has a lot of value, since it's going to be a long time before everyone is on Python 3.6 on Windows 10).

@zooba - My reading so far suggests that the option to disable MAX_PATH limits is only planned to be available, or possibly only accessible, in the Windows 10 Pro and Enterprise editions, not the Home edition.

zooba commented

@GadgetSteve Where are you reading that? It's highly unlikely that such core APIs would be different between Windows SKUs, though of course there's no support included in the home editions for managing group policy. Shouldn't prevent the registry edits from working though.
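
A minimal sketch for checking the machine-wide setting under discussion. The thread does not name the registry value; the sketch assumes the one Microsoft documents for this feature, the `LongPathsEnabled` DWORD under `HKLM\SYSTEM\CurrentControlSet\Control\FileSystem`. The function name is hypothetical, and it returns None on non-Windows platforms so it can be imported anywhere.

```python
import sys


def long_paths_enabled():
    """Return True/False from the registry on Windows, or None elsewhere."""
    if sys.platform != "win32":
        return None
    import winreg  # only available on Windows

    key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, "LongPathsEnabled")
    except OSError:
        return False  # key or value missing: treat as not enabled
    return value == 1


print(long_paths_enabled())
```

Reading the value needs no elevation; changing it does, which matches the note above about needing to be an admin.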

@zooba It looks like the only way to enable the change for the user is via the Group Policy Editor, which is missing from the Home editions. I have had too many changes that should work via registry edits fail to work, revert on an MS update or work erratically (e.g. I cannot get Num Lock to survive a reboot on Win 10 even after the registry edits) to be completely happy with the idea of relying on them.

@pfmoore Any ideas on what could be a possible workaround for this?

Thanks @GadgetSteve! I'll try to summarise what I've understood. Here's how one can work around this issue as of today:

  • If you can use Python 3.6 and Windows 10, use them. Along with everything else, this combination supports long paths as of the Anniversary Update of Windows 10.
    • You might have to enable long path support on Windows 10 though. [instructions]
  • If you can't use Windows 10 and Python 3.6, you'll have to work within the restrictions set by Windows:
    • You can use the --cache-dir option to set a cache directory with a shorter path.
    • Set TEMP to a shorter path, like C:\Temp.
  • You can try making a long path into a short one with the subst and mklink command-line tools (but don't).
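
The two short-path workarounds in the list above can be sketched as follows. The helper name and the example directories (`C:\Temp`, `C:\pipcache`) are assumptions for illustration; `--cache-dir` is the pip option named in the list. The sketch only builds the command and environment and leaves running it to the caller.

```python
import os
import sys


def pip_upgrade_invocation(short_temp, short_cache):
    """Hypothetical helper: build (argv, env) for a pip upgrade using short paths."""
    env = dict(os.environ)
    # Python's tempfile module consults TMPDIR, TEMP and TMP, in that order,
    # so all three are pointed at the short directory (which must exist).
    env["TMPDIR"] = env["TEMP"] = env["TMP"] = short_temp
    argv = [
        sys.executable, "-m", "pip", "install", "--upgrade", "pip",
        "--cache-dir", short_cache,
    ]
    return argv, env


argv, env = pip_upgrade_invocation(r"C:\Temp", r"C:\pipcache")
# To actually run it: subprocess.run(argv, env=env, check=True)
```

This shortens only the temporary and cache prefixes; the install path is still replicated underneath, so a deep enough environment can still exceed the limit.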

Since this is basically fixed with the latest versions of the software involved (pip 9.0.1, Python 3.6.1, Windows 10 with the Anniversary Update) - should there be changes in pip to work around this issue?

zooba commented

I wouldn't suggest going down the subst or mklink routes at all. pip should just name its temp directory something other than the full path to the eventual/original install location.

The fix in Python 3.6 and Windows 10 is fine, but will never apply to all users.

Of course, some deeply nested libraries may well hit the problem at the files' final location for user installs of Python, which is problematic even if the current process never gets that far. Maybe we also need a separate ticket for getting pip to auto-install to wheel with no override if the platform is Windows and any of the resulting paths would exceed the limit.

I agree with @zooba - that's the form I'd expect such a patch to take.

If pip can't install to the final location (as @GadgetSteve suggests) that's not an issue for pip - we'll get an OS error and roll back the install, and it's then the user's issue to solve.

should there be changes in pip to work around this issue?

Yes - a patch is needed that shortens pip's temporary directory paths.


FTR, "awaiting PR" is essentially for indicating that further discussion related to this issue should be deferred until someone comes around to make a PR.

Ah, OK. I've removed the label in that case, as there's no real need for any discussion. Do we not have a label for "Agreed to be a reasonable request, but won't go further until someone steps up with a PR"?

Maybe we should rename the current label to "deferred till PR" and add a
new label with the current name for this situation?

I've gone ahead and done this.

pip should just name its temp directory something other than the full path to the eventual/original install location.

Noting for whoever comes around to making a PR for this: the behaviour change would be to shorten pip's temporary directory paths so that they no longer contain the full path to the install location.

Path Length Limitation

"\\?\D:\very long path"

Note The maximum path of 32,767 characters is approximate, because the "\\?\" prefix may be expanded to a longer string by the system at run time, and this expansion applies to the total length.
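
A minimal sketch of building the extended-length form quoted above. The helper name is hypothetical. It assumes the input is an absolute path, and normalises it first because the prefix turns off the normalisation Windows would otherwise do (forward slashes, `..` segments). `ntpath` keeps the example runnable on any OS.

```python
import ntpath

PREFIX = "\\\\?\\"           # the four characters \\?\
UNC_PREFIX = "\\\\?\\UNC\\"  # replaces the leading \\ of \\server\share


def extended_length(path):
    """Hypothetical helper: return `path` in extended-length form."""
    if path.startswith(PREFIX):
        return path  # already prefixed
    path = ntpath.normpath(path)
    if not ntpath.isabs(path):
        raise ValueError("extended-length paths must be absolute")
    if path.startswith("\\\\"):
        return UNC_PREFIX + path[2:]
    return PREFIX + path


print(extended_length("D:\\very long path"))
```

Every file-system call would have to be given the prefixed form, so this is shown as background rather than as the fix discussed in the thread.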

zooba commented

Question for @pradyunsg @pfmoore @dstufft: would you consider a change to try and use a random name in the target directory instead of TEMP? This has the nice advantage that we can simply rename a directory and then try to delete it, rather than doing multiple copies. When used on install, we can extract/install to the random name and then just rename the directory when it looks good, which drastically simplifies the process of copying permissions/etc., as they'll be inherited normally. And if the user doesn't have permissions to install, it'll fail much sooner.

In both cases, the random name can be the same length as the target name, which will ensure that path length issues don't get any worse than when the package is installed. I guess the downside is potentially leftover files in the install directory when pip hard crashes, but the upside is less crashing and significantly faster installs (I've prototyped some parts of this on Windows and we're talking at least 2x faster for big packages like Django).

Thoughts? Given one-off installs into virtual environments are becoming the norm, I think the risk of cruft being left behind in install directories is worth the other benefits.
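
A minimal sketch of the rename-in-place idea for the uninstall side, under stated assumptions: the class name `RenameStash` is hypothetical, the stash name is formed by swapping the first character for `~` so it stays the same length, and collisions with an existing stash name are not handled.

```python
import os
import shutil


class RenameStash:
    """Hypothetical sketch: set a directory aside by renaming it next to itself."""

    def __init__(self, target):
        self.target = os.path.abspath(target)
        parent, name = os.path.split(self.target)
        # Same length as the original name, so the path gets no longer.
        self.stash = os.path.join(parent, "~" + name[1:])

    def stash_away(self):
        os.rename(self.target, self.stash)  # same volume, nothing is copied

    def rollback(self):
        os.rename(self.stash, self.target)  # put the old files back

    def commit(self):
        shutil.rmtree(self.stash)  # old files are no longer needed
```

A caller would run `stash_away()` before installing the new version, then `commit()` on success or `rollback()` on failure. A hard crash between those steps leaves the `~` directory behind, which is the cruft trade-off described above.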

Seems reasonable to me, I'd just say we should use a name that isn't importable as well.

zooba commented

Substituting the first character with a digit is probably an easy algorithm, at least for the first 10 attempts :)

Yea, or add a prefix like -pip-tmp or something.

zooba commented

I want to avoid generating a longer name, if possible. If we cross the 260 char barrier here, install will fail when the package would otherwise work (apart from caching pyc files... but oh well).

Ah right, that makes sense. Okay.

I think I'd use a leading - or something instead of a digit, just because it feels more purposeful to me? But that's kind of nitpicky, so just a suggestion.

zooba commented

I think it'll have to be an invalid character in a dist-info folder name as well, since those will need the same treatment. So probably I'm going to end up with a list of invalid package name characters that are valid directory name characters and swap them in until there's a free one. Trying - (or ~?) first is fine by me.
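
The "swap characters in until there's a free one" idea can be sketched as below. The function name and the candidate list `-~` are assumptions taken from the suggestions in this exchange; the real list of suitable characters would need to be worked out.

```python
import os

# Assumed candidates: fine in directory names, but not usable as the first
# character of a distribution name or an importable package name.
SWAP_CHARS = "-~"


def free_stash_name(path, swap_chars=SWAP_CHARS):
    """Return a same-length sibling of `path` with its first character swapped."""
    parent, name = os.path.split(path)
    for ch in swap_chars:
        candidate = os.path.join(parent, ch + name[1:])
        if candidate != path and not os.path.lexists(candidate):
            return candidate
    raise FileExistsError("no free stash name for %r" % path)
```

Because only the first character changes, the stash path is exactly as long as the original, so it cannot cross the 260-character barrier unless the installed path already does.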

Presumably another option here is to just not put the entire path inside of the temporary directory as well? Like if there's some common prefix here, we should be able to just exclude that from the temporary directory path that we generate so we're not nesting things nearly as far.

That would turn something like:

C:\\Users\\TREVOR~1\\AppData\\Local\\Temp\\pip-h64zdfhc-uninstall\\users\\trevorsullivan\\source\\repos\\pythonapplication11\\pythonapplication11\\env\\lib\\site-packages\\pip\\_vendor\\requests\\packages\\urllib3\\packages\\ssl_match_hostname\\__pycache__\\_implementation.cpython-34.pyc

into

C:\\Users\\TREVOR~1\\AppData\\Local\\Temp\\pip-h64zdfhc-uninstall\\pip\\_vendor\\requests\\packages\\urllib3\\packages\\ssl_match_hostname\\__pycache__\\_implementation.cpython-34.pyc

It doesn't have the other benefits though (simple renames, etc.), and it could still fail in cases where the rename-in-place strategy would work, namely when the TMPDIR path is longer than the common prefix we'd remove. On the flip side, it does mean that it's probably a smaller delta in our current behavior, which may be a "safer" change to make.

I don't feel strongly one way or another, but the idea popped into my head so figured I'd mention it.
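
A minimal sketch of the common-prefix alternative, reproducing the before/after paths shown in this comment. The helper name is hypothetical, and the sketch assumes the common root is the site-packages directory; `ntpath` keeps it runnable on any OS.

```python
import ntpath


def short_stash_path(temp_dir, installed_file, common_root):
    """Hypothetical helper: keep only the part of the path below common_root."""
    relative = ntpath.relpath(installed_file, common_root)
    if relative == ".." or relative.startswith("..\\"):
        raise ValueError("file is not under the common root")
    return ntpath.join(temp_dir, relative)


site_packages = (
    r"c:\users\trevorsullivan\source\repos\pythonapplication11"
    r"\pythonapplication11\env\lib\site-packages"
)
installed = site_packages + (
    r"\pip\_vendor\requests\packages\urllib3\packages"
    r"\ssl_match_hostname\__pycache__\_implementation.cpython-34.pyc"
)
temp_dir = r"C:\Users\TREVOR~1\AppData\Local\Temp\pip-h64zdfhc-uninstall"

print(short_stash_path(temp_dir, installed, site_packages))
```

The result is shorter than the installed path only when the temporary directory is shorter than the prefix being removed, which is the failure case noted above.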

zooba commented

Yeah, that is going to deal with the original problem here (most of the time). I think it was suggested a few times higher on the thread, and it's definitely a much simpler change.

That said, I think the uninstall case is going to be just as simple: rename and then rmtree, so the delta doesn't get much bigger. And the perf benefit makes wheel extraction worth doing in place as well. Maybe I'll start with uninstall though and send two separate PRs.

I'm OK with the idea of a random name in the target. Just to be clear, we're talking about the temporary install directory that gets moved into place for the final install? Not the build directory?

Sounds OK to me as well.

zooba commented

Just to be clear, we're talking about the temporary install directory that gets moved into place for the final install? Not the build directory?

Correct. IIUC, eventually all installs will come from/via wheels, right? So the install side will only affect the wheel extraction directory, and shouldn't touch build at all.

Yep, under PEP 517 (now in master, but not yet used for all builds) we set up a build directory (which will still be in $TEMP) and call the backend to build a wheel. We then extract the wheel and move the extracted files. If we extract to a random-named directory in the target location and move, that will skip a whole copy step, which is a definite performance improvement[1]

Legacy non-PEP 517 installs go direct via setup.py install, so there's no wheel or wheel extraction step, and they won't change (which is fine by me; as you say, that code path is destined for removal anyway).

[1] We might get a double improvement - I don't know much about how anti-virus software works, but if it can recognise that a move doesn't require a new scan, that could mean that the proposed approach will remove an extra unneeded virus scan as well.
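
The extract-then-move flow described in this comment can be sketched as below, under stated assumptions: the function name is hypothetical, the wheel is treated as a plain zip archive, and only one top-level package directory is handled. A real installer also has to deal with the dist-info directory, scripts and the RECORD file.

```python
import os
import shutil
import zipfile


def extract_then_rename(archive, site_packages, top_level):
    """Hypothetical sketch: unpack beside the destination, then rename into place."""
    staging = os.path.join(site_packages, "~" + top_level[1:])  # same length
    final = os.path.join(site_packages, top_level)
    prefix = top_level + "/"  # zip member names always use forward slashes
    try:
        with zipfile.ZipFile(archive) as zf:
            for info in zf.infolist():
                if info.is_dir() or not info.filename.startswith(prefix):
                    continue  # everything outside the package is skipped here
                parts = info.filename[len(prefix):].split("/")
                if ".." in parts:
                    raise ValueError("unsafe member: %r" % info.filename)
                dest = os.path.join(staging, *parts)
                os.makedirs(os.path.dirname(dest), exist_ok=True)
                with zf.open(info) as src, open(dest, "wb") as dst:
                    shutil.copyfileobj(src, dst)
        os.rename(staging, final)  # same volume, so no second copy
    except BaseException:
        shutil.rmtree(staging, ignore_errors=True)  # leave nothing half-installed
        raise
    return final
```

Each file is written once, directly on the target volume, and the final step is a single directory rename rather than a per-file copy out of TEMP.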

zooba commented

Pretty sure we'll get that double improvement :) I've been profiling with AV scanning enabled, and it seems to be triggered totally differently between copyfile (open/read/close/open/write/close) and rename.
