pypa/pip

Speedup startup time

pradyunsg opened this issue · 44 comments

There's a lot to gain from speeding up pip's startup time.

For one, pip takes around 600ms just to print the completion text, which is laggy (as mentioned in #4755). Further, a faster startup time might help with the test-suite situation too.

floam commented

I did notice one can save about 80ms by using the option to disable the version check; it'd probably make sense to skip the version check entirely for completions by default. Still, a lot remains to be improved.
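(For reference, that's the --disable-pip-version-check flag, e.g. pip --disable-pip-version-check completion --fish.)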

I wonder how much of this time is taken up by importing stuff.

$ time .tox/py36/bin/python -c "from pip._internal import main"

A rudimentary test of running the above 5 times gives me an average of 0.377s.
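A rough sketch of such a test, for anyone who wants to reproduce it (each run goes through a fresh subprocess so imports aren't cached):

import subprocess, time

timings = []
for _ in range(5):
    start = time.perf_counter()
    subprocess.run(
        [".tox/py36/bin/python", "-c", "from pip._internal import main"],
        check=True,
    )
    timings.append(time.perf_counter() - start)
print("average: %.3fs" % (sum(timings) / len(timings)))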

Importing pip._vendor.pkg_resources is what's taking most of the time on my machine. There are a few changes in setuptools 36.4 and above that will help a little: https://github.com/pypa/setuptools/blob/master/CHANGES.rst#v3640

master is vendoring setuptools 36.4.0 currently...

setuptools==36.4.0

Yes, and it's faster than pip 9.0, but there are some further changes in setuptools 36.5 that might help: https://github.com/pypa/setuptools/blob/master/CHANGES.rst#v3650.

  • pip 9.0: python -c 'import pip._vendor.pkg_resources' 0.35s user 0.03s system 100% cpu 0.385 total
  • master: python -c "from pip._vendor import pkg_resources" 0.21s user 0.02s system 99% cpu 0.233 total
  • master+updated pkg_resources: python -c "from pip._vendor import pkg_resources" 0.19s user 0.00s system 99% cpu 0.192 total

Oh, nice. That means the next round of vendor updates would bring some speedup. :)

I, personally, am waiting on a new distlib release before giving the vendored libraries another round of updates.

I fired up the profiler and ran pip completion --fish. Here's what I got (all percentages in terms of total time):

  • initial import of pip._internal.__init__: 85%
    • pip._internal.cmdoptions.__init__: 79%
      • pip._internal.index: 78% (basically all of the above time is spent in this import)
        • This is where it gets interesting
        • pip._vendor.html5lib: 15.8%
        • pip._vendor.requests: 9.0%
        • pip._vendor.distlib: 5.2%
        • pip._vendor.packaging: 7.1%
        • pip._internal.download: 35.4%
          • pip._internal.utils.logging: 30.4%
            • pip._internal.utils.misc: 29.6%
              • pip._vendor.pkg_resources: 28.8%
  • pip._internal.__init__.main(): 15%
    • parseopts(): 0.19%
    • command.main(): 14.7%

PS: Need better profiling tools.
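For reference, a stdlib-only sketch that produces a similar breakdown (run it in a fresh interpreter, so the imports aren't already cached):

import cProfile, pstats

cProfile.run("import pip._internal", "pip-import.prof")
stats = pstats.Stats("pip-import.prof")
stats.sort_stats("cumulative").print_stats(20)  # top 20 entries by cumulative time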

Lazy importing will probably solve some of those.

There are plans to (hopefully) make lazy importing easy to switch on for CLI apps like pip in Python 3.7. CPython 3.7 also gained a -X importtime argument, as well as dtrace/systemtap support, to help track where import time goes when profiling this sort of thing.
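For illustration, the stdlib already has the building blocks for opt-in lazy imports; this is essentially the lazy-import recipe from the importlib docs:

import importlib.util
import sys

def lazy_import(name):
    # Build the module as usual, but swap in a LazyLoader so the module
    # body only executes on first attribute access.
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

pkg_resources = lazy_import("pip._vendor.pkg_resources")  # nearly free until first use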

boxed commented

@benoit-pierre

master+updated pkg_resources: python -c "from pip._vendor import pkg_resources" 0.19s user 0.00s system 99% cpu 0.192 total

What does "updated pkg_resources" mean? On my machine the initial import is still quite slow even though I have setuptools 38.1.0, so your comment seems very interesting to me! :P

@boxed: it means with pip's vendored version of setuptools updated.

boxed commented

Aha. I tried copying pkg_resources from my main install over the one inside pip/_vendor, but I didn't see any difference in speed :/

Using CPython 3.7.0's -X importtime.

import time: self [us] | cumulative | imported package
[snip]
import time:       654 |     506881 | pip._internal
[snip]
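(For reference, output like the above comes from something like python3.7 -X importtime -c "import pip._internal" 2> importtime.log; both columns are in microseconds, so that's roughly 0.5s cumulative for pip._internal.)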

Just in case y'all are unaware, the pkg_resources slowness is being tracked in pypa/setuptools#510; I didn't see it linked in this issue yet.

Could you not move the imports from the top of the file
into the functions that need them?
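For illustration, the suggested pattern looks like this (the function is made up):

def fetch_page(url):
    # Deferred import: the cost of pulling in requests is only paid when
    # this code path actually runs, not at pip startup.
    from pip._vendor import requests
    return requests.get(url)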

boxed commented

@CSDUMMI I have a PR that does this. It helps somewhat.

Could I have a link?

Here is a PR that improves the import situation for the vcs imports: #6545. It removes the pip._internal.vcs imports from pip/_internal/__init__.py. This will make it easy to remove vcs imports from the common case, if desired, which can be done in a subsequent commit.

Is an external CLI framework an option?

@CSDUMMI What do you mean by external CLI framework?

boxed commented

@CSDUMMI I'd say no, not at this time at least. But I do have some ideas along those lines at https://github.com/boxed/p, though it's way too early for that.

@pradyunsg
I did such a thing a few weeks ago: https://github.com/CSDUMMI/cli_framework,
though it's not optimized, of course.
But I'm thinking of something like this:
https://oclif.io/

@boxed Why is it not yet appropriate?

boxed commented

@CSDUMMI I think both oclif and your code have the wrong idea, whereas git has the right idea (for this purpose at least). OCLIF is the more egregious of the two to me: they say "it's open!" but then it's JS-only :( My project is a general dispatcher/configger thingie that is language-agnostic.

I don't think it's appropriate yet because no solid system like that exists. We should absolutely build one, but we should build it separately and use it to wrap pip, pytest, venv, etc. to make a nice Python development workflow; when it's ready we can convince people to use it.

And how do you think we should build such a thing?
What is the right idea git has?

@boxed I have to say your idea in p is very good, but your code needs more comments and documentation in general.

boxed commented

It needs to be used and polished most of all, I think. But yeah, docs are also super important.

@boxed @CSDUMMI Could you please move this conversation elsewhere? It's unrelated to this issue's topic.

But what is wrong with my project? What is the right idea git has?

@pradyunsg I think so too.
That's why I want to know why we shouldn't move
the implementation of the CLI optimization to another place, too!

boxed commented

Let's move the discussion to boxed/p#4

FYI, PR #6694 ("Only import a Command class when needed") was recently merged, which helps with this.
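Roughly, the shape of that change is a registry mapping command names to module paths, importing nothing up front (a simplified sketch, not pip's exact code):

import importlib

# Where each Command class lives; no command module is imported yet.
COMMANDS = {
    "install": ("pip._internal.commands.install", "InstallCommand"),
    "freeze": ("pip._internal.commands.freeze", "FreezeCommand"),
}

def get_command_class(name):
    # Import only the single module the invoked command needs.
    module_path, class_name = COMMANDS[name]
    module = importlib.import_module(module_path)
    return getattr(module, class_name)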

I posted PR #6835 to help with this.

I just posted PR #6843 to continue the work in PR #6835. The PR trims unneeded imports by making it so that commands not requiring downloading / PackageFinder will no longer import that machinery.

I noticed a pretty significant slowdown in the latest released version; I've tracked it down to pyparsing. It might be worth bumping the vendored pyparsing once this gets resolved: pyparsing/pyparsing#362

I also looked into pip startup time a bit. The first thing I noticed was tenacity's import of asyncio, but it seems @ichard26 already took care of that (thanks!).

Another one I noticed is the chardet import. requests supports either chardet or charset_normalizer. In a quick experiment where I replaced the vendored chardet import with a non-vendored charset_normalizer import, I got ~27ms for chardet vs. ~7ms for charset_normalizer, using -X importtime on my admittedly ~10-year-old laptop.
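A quick sketch to reproduce that comparison (each import runs in a fresh interpreter so nothing is cached; assumes both packages are installed):

import subprocess, sys

for mod in ("chardet", "charset_normalizer"):
    # -X importtime writes a per-module breakdown to stderr; the last line
    # holds the cumulative time for the top-level import.
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", "import " + mod],
        capture_output=True, text=True,
    )
    print(mod, "->", result.stderr.strip().splitlines()[-1])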

If there is interest, I can try to prepare a PR to replace the vendored chardet with charset_normalizer.

Note that we can only vendor pure Python libraries. charset_normalizer would at the very least be tricky to vendor because we'd need to find a way to make our vendoring tools ignore the platform-specific wheels. Also, are your benchmarks using the pure Python version? If not, they would need to be re-done to be meaningful.

I don't have any strong opinions on whether we should switch, I'm just noting these points as things to consider if we do.

Hmm, the charset_normalizer GitHub repo tagline says "in pure python", but I guess it's not :)
I just tried again with charset_normalizer-3.3.2-py3-none-any.whl and still get ~7ms, so the pure-Python version looks good as well.

charset_normalizer was added as an optional dependency to requests when the apache-airflow team found the license for chardet wasn't suitable for them.

The developer did a lot to make it more acceptable to the requests maintainers, such as significantly reducing the number of dependencies. My understanding is that charset_normalizer's only binaries come from compiling the pure-Python code with mypyc, and they aren't shipped by default.

If the developer is still as accommodating, I'd imagine pip would benefit in performance and ease of maintenance with charset_normalizer, but the first thing I would do is check with the developer that they are happy being vendored by pip.

OK, submitted PR #12638.

BTW, another big import-time hit is packaging's import of pyparsing. I see that packaging has already replaced pyparsing with a hand-rolled parser, and there is a PR, #12300, to update pip to use it. I checked the import time with #12300 and it's indeed a nice improvement.

With pyparsing, asyncio, and chardet gone, that will be a decent improvement to startup time. The remaining big ones are rich and requests/urllib3, but I don't think there is much to do about those for pip install.