Peru is a tool for including other people's code in your projects. It fetches from anywhere -- git, hg, svn, tarballs -- and puts files wherever you like. Peru helps you track exact versions of your dependencies, so that your history is always reproducible. And it fits inside your scripts and Makefiles, so your build stays simple and foolproof.
If you build with make, you don't have to do anything special when you switch branches or pull new commits. Build tools notice those changes without any help. But if you depend on other people's code, the tools aren't so automatic anymore. You need to remember when to git submodule update, or when to refresh your virtualenv. If you forget a step you can break your build, or worse, you might build something wrong without noticing.
Peru wants you to automate dependency management just like you automate the rest of your build. It doesn't interfere with your repo or install anything global, so you can just throw it in at the start of a script and forget about it. It'll run every time, and your dependencies will never be out of sync. Simple, and fast as heck.
The name "peru", along with our love for reproducible builds, was inspired by Amazon's Brazil build system. It also happens to be an anagram for "reup".
Peru supports Linux, Mac, and Windows. It requires Python (3.3 or later) and git, and optionally hg and svn if you want to fetch from those types of repos. Use pip to install it:
pip install peru
Note that depending on how Python is set up on your machine, you might need to use sudo with that, and Python 3's pip might be called pip3.
On Ubuntu, you can also install peru from our PPA. On Arch, you can install peru-git from the AUR.
Here's the peru version of the first git submodules example from the Git Book. We're going to add the Rack library to our project. First, create a peru.yaml file like this:
imports:
    rack_example: rack/  # This is where we want peru to put the module.

git module rack_example:
    url: git://github.com/chneukirchen/rack.git
Now run peru sync.
Peru cloned Rack for you, and imported a copy of it under the rack directory. It also created a magical directory called .peru to hold that clone and some other business. If you're using source control, now would be a good time to put these directories in your ignore list (like .gitignore). You usually don't want to check them in.
Running peru clean will make the imported directory disappear. Running peru sync again will make it come back, and it'll be a lot faster this time, because peru caches everything.
For a more involved example, let's use peru to manage some dotfiles. We're big fans of the Solarized colorscheme, and we want to get it working in both ls and vim. For ls all we need peru to do is fetch a Solarized dircolors file. (That'll get loaded somewhere like .bashrc, not included in this example.) For vim we're going to need the Solarized vim plugin, and we also want Pathogen, which makes plugin installation much cleaner. Here's the peru.yaml:
imports:
    # The dircolors file just goes at the root of our project.
    dircolors: ./
    # We're going to merge Pathogen's autoload directory into our own.
    pathogen: .vim/autoload/
    # The Solarized plugin gets its own directory, where Pathogen expects it.
    vim-solarized: .vim/bundle/solarized/

git module dircolors:
    url: https://github.com/seebi/dircolors-solarized
    # Only copy this file. Can be a list of files. Accepts * and ** globs.
    pick: dircolors.ansi-dark

curl module pathogen:
    url: https://codeload.github.com/tpope/vim-pathogen/tar.gz/v2.3
    # Untar the archive after fetching.
    unpack: tar
    # After the unpack, use this subdirectory as the root of the module.
    export: vim-pathogen-2.3/autoload/

git module vim-solarized:
    url: https://github.com/altercation/vim-colors-solarized
    # Always fetch this exact commit, instead of master.
    rev: 7a7e5c8818d717084730133ed6b84a3ffc9d0447
The contents of the dircolors module are copied to the root of our repo. The pick field restricts this to just one file, dircolors.ansi-dark.
The pathogen module uses the curl type instead of git, and its URL points to a tarball. (This is for the sake of an example. In real life you'd probably use git here too.) The unpack field means that we get the contents of the tarball rather than the tarball file itself. Because the module specifies an export directory, it's that directory rather than the whole module that gets copied to the import path, .vim/autoload. The result is that Pathogen's autoload directory gets merged with our own, which is the standard way to install Pathogen.
The vim-solarized module gets copied into its own directory under bundle, which is where Pathogen will look for it. Note that it has an explicit rev field, which tells peru to fetch that exact revision, rather than the default branch (master in git). That's a Super Serious Best Practice™, because it means your dependencies will always be consistent, even when you look at commits from a long time ago.
You really want all of your dependencies to have explicit hashes, but editing those by hand is painful, especially if you have a lot of dependencies. The next section is about making that easier.
If you run peru reup, peru will talk to each of your upstream repos, get their latest versions, and then edit your peru.yaml file with any updates. If you don't have peru.yaml checked into some kind of source control, you should probably do that first, because the reup will modify it in place. When we reup the example above, the changes look something like this:
diff --git a/peru.yaml b/peru.yaml
index 15c758d..7f0e26b 100644
--- a/peru.yaml
+++ b/peru.yaml
@@ -6,12 +6,14 @@ imports:
 git module dircolors:
     url: https://github.com/seebi/dircolors-solarized
     pick: dircolors.ansi-dark
+    rev: a5e130c642e45323a22226f331cb60fd37ce564f

 curl module pathogen:
     url: https://codeload.github.com/tpope/vim-pathogen/tar.gz/v2.3
     unpack: tar
     export: vim-pathogen-2.3/autoload/
+    sha1: 9c3fd6d9891bfe2cd3ed3ddc9ffe5f3fccb72b6a

 git module vim-solarized:
     url: https://github.com/altercation/vim-colors-solarized
-    rev: 7a7e5c8818d717084730133ed6b84a3ffc9d0447
+    rev: 528a59f26d12278698bb946f8fb82a63711eec21
Peru made three changes:
- The dircolors module, which didn't have a rev before, just got one. By default for git, this is the current master. To change that, you can set the reup field to the name of a different branch.
- The pathogen module got a sha1 field. Unlike git, a curl module is plain old HTTP, so it's stuck downloading whatever file is at the url. But it will check this hash after the download is finished, and it will raise an error if there's a mismatch.
- The vim-solarized module had a hash before, but it's been updated. Again, the new value came from master by default.
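To make the sha1 check a bit more concrete, here's a minimal Python sketch of that kind of verification. This is not peru's actual code: the function names sha1_of_file and verify_download are made up for illustration, and the sketch assumes the sha1 field is simply the SHA-1 hex digest of the downloaded file's bytes.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha1):
    """Raise ValueError if the file doesn't match the expected hash.

    Hypothetical helper: mimics the "check after download, error on
    mismatch" behavior described above, nothing more.
    """
    actual = sha1_of_file(path)
    if actual != expected_sha1:
        raise ValueError(
            "sha1 mismatch for {}: expected {}, got {}".format(
                path, expected_sha1, actual))
```

The check only runs once the whole file is on disk, which matches the "after the download is finished" wording: a plain HTTP fetch can't ask the server for a particular version, so all it can do is notice afterwards that the bytes changed.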
At this point, you'll probably want to make a new commit of peru.yaml to record the version bumps. You can do this every so often to keep your plugins up to date, and you'll still be able to reach old versions in your history.
sync
- Pull in your imports. sync yells at you instead of overwriting existing or modified files. Use --force/-f to tell it you're serious.
clean
- Remove imported files. Same --force/-f flag as sync.
reup
- Update module fields with new revision information. For git, hg, and svn, this updates the rev field. For curl, this sets the sha1 field. You can optionally give specific module names as arguments.
copy
- Make a copy of all the files in a module. Either specify a directory to put them in, or peru will create a temp dir for you. You can use this to see modules you don't normally import, or to play with different module/rule combinations (see "Rules" below).
override
- Replace the contents of a module with a local directory path, usually a clone you've made of the same repo. This lets you test changes to imported modules without needing to push your changes upstream or edit peru.yaml.
The git, hg, and svn module types are for cloning repos. These types all provide the same fields:
- url: required, any protocol supported by the underlying VCS
- rev: optional, the specific revision/branch/tag to fetch
- reup: optional, the branch/tag to get the latest rev from when running peru reup
The curl module type is for downloading a file from a URL. This type is powered by Python's standard library, rather than an external program.
- url: required, any kind supported by urllib (HTTP, FTP, file://)
- filename: optional, overrides the default filename
- sha1: optional, checks that the downloaded file matches the checksum
- unpack: optional, tar or zip
Peru includes a few other types mostly for testing purposes. See rsync for an example implemented in Bash.
Module type plugins are as-dumb-as-possible scripts that only know how to sync, and optionally reup. Peru shells out to them and then handles most of the caching magic itself, though plugins can also do their own caching as appropriate. For example, the git and hg plugins keep track of repos they clone. Peru itself doesn't need to know how to do that. For all the details, see Architecture: Plugins.
Some fields (like url and rev) are specific to certain module types. There are also fields you can use in any module, which modify the tree of files after it's fetched. Some of these made an appearance in the fancy example above:
- copy: A map or multimap of source and destination paths to copy. Works like cp on the command line, so if the destination is a directory, it'll preserve the source filename and copy into the destination directory.
- move: A map or multimap of source and destination paths to move. Similar to copy above, but removes the source.
- pick: A file or directory, or a list of files and directories, to include in the module. Everything else is dropped. Paths can contain * or ** globs.
- executable: A file or list of files to make executable, as if calling chmod +x. Also accepts globs.
- export: A subdirectory that peru should treat as the root of the module tree. Everything else is dropped, including parent directories.
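If it helps to see pick and export as plain tree transformations, here's a rough Python sketch. It models a module as a dict mapping relative paths to file contents, which is purely an illustration and not how peru stores trees. The glob handling is also an assumption: this sketch treats * as matching within a single path component and ** as matching across directories, and peru's real matching rules may differ in the details.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob: '*' stays inside one path component, '**' may cross '/'."""
    pieces = []
    for part in re.split(r"(\*\*|\*)", pattern):
        if part == "**":
            pieces.append(".*")
        elif part == "*":
            pieces.append("[^/]*")
        else:
            pieces.append(re.escape(part))
    return re.compile("".join(pieces))


def pick(tree, patterns):
    """Keep only files matching a pattern, or living under a matching directory."""
    if isinstance(patterns, str):
        patterns = [patterns]
    regexes = [glob_to_regex(p.rstrip("/")) for p in patterns]
    kept = {}
    for path, content in tree.items():
        parts = path.split("/")
        # Check the file itself and every directory above it.
        prefixes = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
        if any(r.fullmatch(p) for r in regexes for p in prefixes):
            kept[path] = content
    return kept


def export(tree, subdir):
    """Treat subdir as the new root; everything outside it is dropped."""
    prefix = subdir.strip("/") + "/"
    return {path[len(prefix):]: content
            for path, content in tree.items()
            if path.startswith(prefix)}


# A made-up module tree, just for trying things out.
TREE = {
    "COPYING": "license text",
    "setup.py": "...",
    "asyncio/events.py": "...",
    "asyncio/tasks.py": "...",
    "docs/index.rst": "...",
}
print(sorted(pick(TREE, ["docs", "COPYING"])))
print(sorted(export(TREE, "asyncio/")))
```

Note the difference in what happens to paths: pick leaves the surviving files where they were, while export strips the parent directories off, which is why the Pathogen example earlier ends up with the contents of autoload rather than a nested vim-pathogen-2.3 directory.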
Besides using those fields in your modules, you can also use them in "named rules", which let you transform one module in multiple ways. For example, say you want the asyncio subdir from the Tulip project, but you also want the license file somewhere else. Rather than defining the same module twice, you can use one module and two named rules, like this:
imports:
    tulip|asyncio: python/asyncio/
    tulip|license: licenses/

hg module tulip:
    url: https://code.google.com/p/tulip/

rule asyncio:
    export: asyncio/

rule license:
    pick: COPYING
As in the example above, named rules are declared a lot like modules and then used in the imports list, with the syntax module|rule. The | operator there works kind of like a shell pipeline, so you can even do twisted things like module|rule1|rule2, with each rule applying to the output tree of the previous.
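The pipeline idea can be sketched in a few lines of Python. This is an illustration only, not peru's implementation: parse_target and apply_rules are hypothetical names, trees are modeled as dicts of path to contents, and the two rule functions are hand-written stand-ins for the export: asyncio/ and pick: COPYING rules from the example.

```python
def parse_target(target):
    """Split 'module|rule1|rule2' into ('module', ['rule1', 'rule2'])."""
    module, *rule_names = target.split("|")
    return module, rule_names


def apply_rules(tree, rule_names, rules):
    """Feed each rule the output tree of the previous one, left to right."""
    for name in rule_names:
        tree = rules[name](tree)
    return tree


def export_asyncio(tree):
    # Stand-in for 'export: asyncio/': re-root the tree at asyncio/.
    prefix = "asyncio/"
    return {p[len(prefix):]: c for p, c in tree.items() if p.startswith(prefix)}


def pick_copying(tree):
    # Stand-in for 'pick: COPYING': keep that one file, drop the rest.
    return {p: c for p, c in tree.items() if p == "COPYING"}


RULES = {"asyncio": export_asyncio, "license": pick_copying}
TULIP = {"COPYING": "license text", "setup.py": "...", "asyncio/events.py": "..."}

for target in ("tulip|asyncio", "tulip|license"):
    module, rule_names = parse_target(target)
    print(target, "->", apply_rules(TULIP, rule_names, RULES))
```

Order matters in a chain like this. Under these stand-in rules, tulip|asyncio|license would come out empty, because the license file is already gone by the time the second rule runs.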
If you import a module that has a peru file of its own, peru will automatically sync that module's imports as part of yours. It's also possible to refer directly to the modules that another module defines. For example if your project defines module foo, and foo has a peru file that defines module bar, then in your project you can import foo.bar.
There are several flags and environment variables you can set, to control where peru puts things. Flags always take precedence.
- --file=<file>: The path to your peru YAML file. By default peru looks for peru.yaml in the current directory or one of its parents. This setting tells peru to use a specific file. If set, --sync-dir must also be set.
- --sync-dir=<dir>: The path that all imports are interpreted relative to. That is, if you import a module to ./, the contents of that module go directly in the sync dir. By default this is the directory containing your peru.yaml file. If set, --file must also be set.
- --state-dir=<dir>: The directory where peru stashes all of its state metadata, and also the parent of the cache dir. By default this is .peru inside the sync dir. You should not share this directory between two projects, or peru sync will get confused.
- --cache-dir=<dir> or PERU_CACHE_DIR: The directory where peru keeps everything it's fetched. If you have many copies of the same project, for example on a server running automated tests, you can use a shared cache to speed up syncs.
- --file-basename=<name>: Changes the default name for peru.yaml without providing a full path. Peru will search the current directory and its parents for a file of the right name, and it will use that file's parent as the sync dir, as usual. Incompatible with --file.
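The default lookup (walk up from the current directory until a file with the right name turns up, then use its directory as the sync dir) looks roughly like this in Python. It's a sketch of the behavior described above, not peru's own code, and find_peru_file is a hypothetical name.

```python
from pathlib import Path


def find_peru_file(start_dir, basename="peru.yaml"):
    """Search start_dir and its parents for basename.

    Returns (path_to_file, sync_dir), where sync_dir is the directory
    containing the file. Raises FileNotFoundError if nothing is found.
    """
    start = Path(start_dir).resolve()
    for directory in [start, *start.parents]:
        candidate = directory / basename
        if candidate.is_file():
            return candidate, directory
    raise FileNotFoundError(
        "no {} in {} or any parent directory".format(basename, start))
```

Passing a different basename corresponds to what --file-basename does: the search is the same, only the name being looked for changes.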