pip-faster is like pip, but faster.
This project ships two distinct components:
- venv-update, a small script designed to keep a virtualenv in sync with a changing list of requirements. Given a list of requirements.txt files, venv-update makes sure the virtualenv state is exactly the same as if you deleted and regenerated the virtualenv (but does so much more quickly). venv-update can also be used by tools like tox during testing.
- pip-faster, a drop-in replacement for pip which tries hard to make package installation as fast as possible while preserving pip's semantics.
Both components are designed for use on large projects with hundreds of requirements and are used daily by Yelp engineers.
venv-update is a small script whose job is to idempotently ensure the existence of a project's virtualenv based on a set of requirements files.
We like to call venv-update from our Makefiles to create and maintain a virtualenv. It does the following:
- Ensures a virtualenv exists at the specified location with the specified Python version, and that it is valid. It will create or recreate a virtualenv as necessary to ensure that one venv-update invocation is all that's needed.
- Calculates the difference in packages derived from the requirements.txt files and the installed packages. Packages will be uninstalled, upgraded, or installed as necessary. The goal is that venv-update will put you in the same state as if you wipe away your virtualenv and rebuild it with pip install, but much more quickly.
- Takes advantage of pip-faster for package installation (see below) to avoid network access and rebuilding packages as much as possible.
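The package-difference step in the list above can be pictured with a minimal sketch. This is not venv-update's actual code: the function name `plan_sync`, the dict-based inputs, and the sample package versions are all assumptions made for illustration, and it only handles exactly pinned versions.

```python
def plan_sync(required, installed):
    """Compare pinned requirements against what is installed.

    Both arguments map a package name to an exact version string.
    Returns three sorted lists of names: packages to install,
    packages whose version must change, and packages to uninstall.
    (Hypothetical helper, not part of venv-update.)
    """
    to_install = sorted(name for name in required if name not in installed)
    to_change = sorted(
        name for name in required
        if name in installed and installed[name] != required[name]
    )
    to_uninstall = sorted(name for name in installed if name not in required)
    return to_install, to_change, to_uninstall


# Made-up sample data for illustration only.
required = {"requests": "2.9.1", "six": "1.10.0", "pytest": "2.8.7"}
installed = {"requests": "2.8.0", "six": "1.10.0", "simplejson": "3.8.1"}

print(plan_sync(required, installed))
```

When all three lists come back empty, nothing needs to be done, which is the no-op case discussed next.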
For reference, a project with 250 dependencies which are all pinned can run a no-op venv-update in ~2 seconds with no network access. The running time when changes are needed is dominated by the time it takes to download and install packages, but is generally quite fast (on the order of ~10 seconds).
Because this tool is meant to be the entry-point for handling requirements and dependencies, it's not meant to be installed via pip; that would require developers to first create a virtualenv, which defeats the entire purpose of the project.
Instead, the venv_update.py script is designed to be vendored (directly checked in) to your project, and has no dependencies besides virtualenv and the standard Python library.
Simply running venv_update.py will create a virtualenv named venv in the current directory, using requirements.txt in the current directory. You can pass additional options to both virtualenv and pip. A typical invocation looks something like this:
./venv_update.py venv-name -- -r requirements.txt -r requirements-dev.txt
Arguments to virtualenv should go before the --; arguments to pip should go after it.
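As a rough illustration of the `--` convention, the sketch below splits a command line into a virtualenv part and a pip part. The function name `split_args` is hypothetical, and falling back to the defaults independently for each side is an assumption; the real script may handle partial command lines differently.

```python
# Defaults taken from the description above: a virtualenv named "venv"
# built from requirements.txt in the current directory.
DEFAULT_VENV_ARGS = ["venv"]
DEFAULT_PIP_ARGS = ["-r", "requirements.txt"]


def split_args(argv):
    """Split argv at the first "--" into (virtualenv args, pip args).

    Hypothetical helper: an empty side falls back to its default,
    which is an assumption rather than documented behavior.
    """
    if "--" in argv:
        index = argv.index("--")
        venv_args, pip_args = argv[:index], argv[index + 1:]
    else:
        venv_args, pip_args = list(argv), []
    return venv_args or DEFAULT_VENV_ARGS, pip_args or DEFAULT_PIP_ARGS


print(split_args(["venv-name", "--", "-r", "requirements.txt", "-r", "requirements-dev.txt"]))
print(split_args([]))
```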
venv-update is a good fit for use with make because it is idempotent and should never normally fail. Here's an example Makefile:
VENV := venv

$(VENV): requirements.txt requirements-dev.txt
	./venv_update.py $(VENV) -- -r requirements.txt -r requirements-dev.txt

.PHONY: run-some-script
run-some-script: $(VENV)
	$(VENV)/bin/some-script
tox is a useful tool for testing libraries against multiple versions of the Python interpreter. You can speed it up by telling it to use venv-update for dependency installation; not only will it avoid network access and prefer wheels, but it's also better at syncing a virtualenv (whereas tox will often throw out an entire virtualenv and start over).
To start using venv-update inside tox, copy the venv-update script into your project (for example, at bin/venv-update). Then, apply a change like this to your tox.ini file:
  [testenv]
+ venv_update = {toxinidir}/bin/venv-update {envdir} -- -r {toxinidir}/requirements.txt -e {toxinidir}
- deps = -rrequirements.txt
  commands =
+     {[testenv]venv_update}
      py.test tests/
      pre-commit run --all-files
The exact changes will vary slightly by project, but the above is a general template. The most important part is running venv-update as the first test command and removing the list of deps (so that tox will never invalidate your virtualenv itself; we want to let venv-update manage that instead).
Users of tox version <2 will want to add this as well, to avoid tox installing all your dependencies with pip-slower:
  [tox]
  envlist = py27,py34
+ skipsdist = true
pip-faster is designed to act as a drop-in replacement for pip. It supports all of the same arguments as pip (calling pip internally for most tasks).
Package installation is the most heavily-optimized area:
- We've taken great pains to reduce the number of round-trips to PyPI, which makes up the majority of time spent on what should be a no-op update. For example, if you're installing a specific version of a package which we already have cached, there's no need to talk to PyPI, but vanilla pip will.
- Packages are downloaded and wheeled before installation (if they aren't available from PyPI as wheels). If the virtualenv needs to be rebuilt, or you use the same requirement in another project, the wheel can be reused. This greatly speeds up installation of projects like lxml or numpy which have a slow-to-compile binary component.
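The idea behind skipping PyPI for an already-cached pinned version can be sketched as follows. This is an illustration only, not pip-faster's implementation: `find_cached_wheel` is a hypothetical helper, the list of filenames stands in for a real wheel cache directory, and the sketch ignores whether a wheel's Python, ABI, and platform tags match the current interpreter.

```python
import re


def normalize(name):
    """Normalize a project name so "Foo_Bar" and "foo-bar" compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_cached_wheel(requirement, wheel_filenames):
    """Return a cached wheel filename satisfying a pinned requirement.

    `requirement` looks like "lxml==3.5.0". Anything that is not an exact
    pin returns None, since only an exact pin can be satisfied without
    asking the index which versions exist. (Hypothetical helper.)
    """
    name, sep, version = requirement.partition("==")
    if not sep:
        return None
    name, version = name.strip(), version.strip()
    for filename in wheel_filenames:
        if not filename.endswith(".whl"):
            continue
        # Wheel filenames have the form name-version[-build]-python-abi-platform.whl
        parts = filename[:-len(".whl")].split("-")
        if len(parts) < 5:
            continue
        if normalize(parts[0]) == normalize(name) and parts[1] == version:
            return filename
    return None


# Made-up cache contents for illustration.
cache = [
    "lxml-3.5.0-cp27-cp27mu-linux_x86_64.whl",
    "six-1.10.0-py2.py3-none-any.whl",
]
print(find_cached_wheel("lxml==3.5.0", cache))  # cache hit: no network needed
print(find_cached_wheel("lxml==3.6.0", cache))  # cache miss: would need the index
```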
One behavioral difference from stock pip is that pip-faster will refuse to install package versions which conflict (we generally consider this a feature); stock pip, on the other hand, will happily install conflicting packages.
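To make the notion of a conflict concrete, here is a deliberately simplified sketch that only understands exact pins. The function `find_conflicts`, the tuple layout, and the sample data (including the package name some-lib) are hypothetical; pip-faster's real check works on full requirement specifiers and is not shown here.

```python
def find_conflicts(pins):
    """Report packages pinned to two different versions.

    `pins` is a list of (requirer, package, version) tuples, where
    `requirer` is whatever asked for the package. Returns a list of
    (package, first_request, conflicting_request) tuples.
    (Hypothetical helper handling exact pins only.)
    """
    first_seen = {}
    conflicts = []
    for requirer, package, version in pins:
        if package not in first_seen:
            first_seen[package] = (requirer, version)
        elif first_seen[package][1] != version:
            conflicts.append((package, first_seen[package], (requirer, version)))
    return conflicts


# Made-up data: the requirements file and a dependency disagree about six.
pins = [
    ("requirements.txt", "six", "1.10.0"),
    ("requirements.txt", "some-lib", "2.0.0"),
    ("some-lib", "six", "1.9.0"),
]

for package, first, second in find_conflicts(pins):
    print("conflict on %s: %s wants %s, %s wants %s"
          % (package, first[0], first[1], second[0], second[1]))
```

A tool that refuses to proceed would stop with an error when this list is non-empty, rather than installing whichever version happened to be requested last.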
For interactive use, you can normally just pip install pip-faster the same way you would any other Python tool.
If you're only using venv-update, it's not necessary to install pip-faster; the venv-update script will install the correct version inside your virtualenv for you.
In almost all cases, performance will be much better if you use an internal PyPI server instead of the public PyPI.
Besides the potentially lower latency, an internal PyPI server allows for uploading binary wheels compiled for Linux. Unlike on OS X or Windows, installing projects like lxml on Linux is normally extremely slow since they will need to be compiled during every installation.
pip-faster improves this by only compiling on the first installation for each user (this is also the default behavior for pip >= 6), but this doesn't help for the first run.
Using an internal PyPI server which allows uploading of Linux wheels can improve speed greatly. Unfortunately, these wheels are guaranteed compatible only with the same Linux distribution they were compiled on, so this only works if your developers work in very homogeneous environments.
For both venv-update and pip-faster, you can specify an index server by setting the PIP_INDEX_URL environment variable (or PIP_EXTRA_INDEX_URL if you want to supplement but not replace the default PyPI). For pip-faster you can also use -i or --extra-index-url, just like in regular pip.
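If you drive venv-update from a Python wrapper script, one way to pass the index server is to build the child process environment explicitly. The sketch below is only an illustration: `env_with_index` is a hypothetical helper and https://pypi.example.com/simple is a placeholder URL, not a real server.

```python
import os


def env_with_index(index_url, extra=False, base=None):
    """Return a copy of the environment with the pip index variable set.

    With extra=False the default index is replaced (PIP_INDEX_URL);
    with extra=True it is supplemented (PIP_EXTRA_INDEX_URL).
    `base` defaults to the current process environment.
    (Hypothetical helper.)
    """
    env = dict(os.environ if base is None else base)
    key = "PIP_EXTRA_INDEX_URL" if extra else "PIP_INDEX_URL"
    env[key] = index_url
    return env


# Placeholder URL; substitute your own internal index.
env = env_with_index("https://pypi.example.com/simple")
print(env["PIP_INDEX_URL"])

# The environment could then be handed to the vendored script, for example:
#   subprocess.check_call(["./venv_update.py"], env=env)
```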