Install packages in parallel
Opened this issue · 3 comments
What's the problem this feature will solve?
This is to improve the performance of pip.
For example, looking at #12613 (comment) for a large install: even with resolving, downloading, and building sdists, installing takes over 8% of the time. As resolving becomes faster, downloads run in parallel, and (hopefully) more packages ship wheels instead of sdists, installing will become a larger part of the total time.
Describe the solution you'd like
After the resolve, downloads, and sdist builds have completed, the installs could run in parallel.
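To make the idea concrete, here is a rough sketch of what "install in parallel once everything is built" could look like with a thread pool. This is not pip's code: `install_wheel` and `install_all` are hypothetical names, and the "install" step is reduced to unpacking the archive (no script generation, no RECORD handling, no uninstall of old versions).

```python
from __future__ import annotations

import zipfile
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def install_wheel(wheel: Path, target: Path) -> str:
    """Hypothetical stand-in for a real install step: just unpack the archive."""
    with zipfile.ZipFile(wheel) as zf:
        zf.extractall(target)
    return wheel.name


def install_all(wheels: list[Path], target: Path, max_workers: int = 4) -> list[str]:
    """Install already-built wheels concurrently.

    Assumes resolving, downloading and sdist builds are finished, so every
    entry in `wheels` is a ready-to-unpack file on disk.
    """
    installed = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(install_wheel, w, target) for w in wheels]
        for fut in as_completed(futures):
            # .result() re-raises any exception from the worker thread.
            installed.append(fut.result())
    return sorted(installed)
```

Threads seem like a reasonable first guess here because the work is mostly file I/O, but whether that actually helps in practice would need measuring.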
Alternative Solutions
Keep as is.
Additional context
This would obviously require a PR from someone. I think we would need to make sure there is a full complement of tests for installing packages in parallel, covering different kinds of packages (e.g. multiple editables running at the same time, editables alongside regular installs, etc.).
uv has already implemented this successfully; following their issue tracker, this has been the last problematic part of making things parallel/concurrent.
Code of Conduct
- I agree to follow the PSF Code of Conduct.
See also #8187 (comment).
Thanks, I hadn't seen that before. I'll have a good read through and see if this is a straight-up duplicate, and whether anything can be done to get the existing work landed in pip.
As the author of the linked comment, I'll add that the key new development is that uv has implemented parallel installs. It would be interesting to know how they designed things. It's quite possible that pip could learn some useful lessons.
I've not looked at how uv implements this at all, so the following is pure speculation, but if I had to guess, I'd imagine they have the following things in their favour:
- They may well have designed from the start for parallel tasks. One concern I have for pip is getting reporting right, for instance, because we have[1] some stateful code that handles getting indentation correct, that might be broken by multiple threads.
- Rust has better thread safety than Python, so there's likely a class of issues that uv simply can't encounter (at least, not by accident).
- To be blunt, they may just not have worried about pathological cases. For example, installing two wheels in parallel, which both contain the same filename but with different content, is a potential race condition (writing the file itself and RECORD). But it's unlikely in practice, so maybe uv ignored the possibility. Pip has a larger user base, and a longer history of dealing with weird errors, so we may well simply be (for better or worse) more paranoid over things like this.
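As an illustration of the "same filename, different content" case, a pre-check along these lines could flag overlapping wheels before any parallel work starts. This is only a sketch under assumptions: `find_conflicts` is a made-up helper, not something pip or uv is known to do, and it compares archive member paths rather than final installed locations.

```python
from __future__ import annotations

import hashlib
import zipfile
from collections import defaultdict
from pathlib import Path


def find_conflicts(wheels: list[Path]) -> dict[str, list[str]]:
    """Return archive members that occur in several wheels with differing content.

    Keys are member paths inside the archives; values are the sorted names
    of the wheels that ship that path.
    """
    # member path -> {wheel file name: sha256 of that member's bytes}
    seen: dict[str, dict[str, str]] = defaultdict(dict)
    for wheel in wheels:
        with zipfile.ZipFile(wheel) as zf:
            for info in zf.infolist():
                if info.is_dir():
                    continue
                digest = hashlib.sha256(zf.read(info)).hexdigest()
                seen[info.filename][wheel.name] = digest
    return {
        member: sorted(owners)
        for member, owners in seen.items()
        if len(set(owners.values())) > 1  # same path, different bytes
    }
```

Wheels that show up in the result could then be installed one after another, with only the rest going through the parallel path. The cost is reading every file once up front, which may or may not be acceptable.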
Footnotes
1. Or at least we used to, I haven't looked at that code since we started using rich...