Beaker Gantry

Gantry streamlines running Python experiments in Beaker by managing containers and boilerplate for you.

⚡️Easy to use

No Docker required! 🚫 🐳
No writing YAML experiment specs.
Easy setup.
Simple CLI.

🏎 Fast

Fire off Beaker experiments from your local computer instantly!
No local image build or upload.

🪶 Lightweight

Pure Python (built on top of beaker-py).
Minimal dependencies.

Who is this for?

Gantry is for both new and seasoned Beaker users who need to run Python batch jobs (as opposed to interactive sessions) from a rapidly changing repository. Without Gantry, this workflow usually looks like this:

Add a Dockerfile to your repository.
Build the Docker image locally.
Push the Docker image to Beaker.
Write a YAML Beaker experiment spec that points to the image you just uploaded.
Submit the experiment spec.
Make changes and repeat from step 2.

This requires experience with Docker, experience writing Beaker experiment specs, and a fast and reliable internet connection (a luxury that some of us don't have, especially in the WFH era 🙃).

With Gantry, on the other hand, that same workflow simplifies down to this:

Write a conda environment.yml file, or simply a PIP requirements.txt and/or setup.py file.
Commit and push your changes.
Submit and track a Beaker experiment with the gantry run command.
Make changes and repeat from step 2.

In this README

💾 Installing
🚀 Quick start
👓 Best practices
❓ FAQ

Additional info

👋 Examples

Saving results / metrics from an experiment

💻 For developers

Installing

Installing with `pip`

Gantry is available on PyPI. Just run:

pip install beaker-gantry

Installing from source

To install Gantry from source, first clone the repository:

git clone https://github.com/allenai/beaker-gantry.git
cd beaker-gantry

Then run:

pip install -e .

Quick start

One-time setup

Create and clone your repository.

If you haven't already done so, create a GitHub repository for your project and clone it locally. Every gantry command you run must be invoked from the root directory of your repository.

Configure Gantry.

If you've already configured the Beaker command-line client, Gantry will find and use the existing configuration file (usually located at $HOME/.beaker/config.yml). Otherwise just set the environment variable BEAKER_TOKEN to your Beaker user token.

The first time you call gantry run ... you'll also be prompted to provide a GitHub personal access token with the repo scope if your repository is not public. This allows Gantry to clone your private repository when it runs in Beaker. You don't have to do this just yet (Gantry will prompt you for it), but if you need to update this token later you can use the gantry config set-gh-token command.
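
As a quick local preflight before the first gantry run, the Beaker credential lookup described in this step can be approximated with a few lines of Python. This is only a sketch: the function name `find_beaker_credentials` is hypothetical, and the order of the checks follows the wording of this step rather than Gantry's actual lookup logic, which may differ.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_beaker_credentials(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Optional[str]:
    """Describe where Beaker credentials were found, or return None.

    Hypothetical helper: it mirrors the README text, not Gantry's internals.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    # Usual location of the Beaker command-line client's config file.
    config_path = home / ".beaker" / "config.yml"
    if config_path.is_file():
        return f"config file: {config_path}"

    # Otherwise fall back to the BEAKER_TOKEN environment variable.
    if env.get("BEAKER_TOKEN"):
        return "environment variable: BEAKER_TOKEN"

    return None


found = find_beaker_credentials()
print(found or "no Beaker credentials found; set BEAKER_TOKEN")
```

If the helper prints the fallback message, setting BEAKER_TOKEN as described above should be enough.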

Specify your Python environment.

Lastly - and this is the most important part - you'll have to create one of several different files that specify your Python environment. There are three options:

1. A conda environment.yml file.
2. A setup.py file.
3. A PIP requirements.txt file.

The first method is the recommended approach, especially if you're already using conda. But it's perfectly okay to use a combination of these different approaches as well. This can be useful when, for example, you need to use a CUDA-enabled version of PyTorch on Beaker but a CPU-only version locally.

Submit your first experiment with Gantry

Let's spin up a Beaker experiment that just prints "Hello, World!" from Python.

First make sure you've committed and pushed all changes so far in your repository. Then (from the root of your repository) run:

gantry run --workspace {WORKSPACE} --cluster {CLUSTER} -- python -c 'print("Hello, World!")'

Just replace {WORKSPACE} with the name of your own Beaker workspace and {CLUSTER} with the name of the Beaker cluster you want to run on.

❗Note: Everything after the -- is the command + arguments you want to run on Beaker. It's necessary to include the -- if any of your arguments look like options themselves (like -c in this example) so gantry can differentiate them from its own options.

Try gantry run --help to see all of the available options.
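
If you launch experiments from a script rather than typing the command by hand, the same invocation can be assembled as an argument list. The sketch below is one way to do that, using only the --workspace and --cluster options shown above; `build_gantry_run` is a hypothetical helper, and the workspace and cluster names are placeholders. It keeps everything after the -- untouched so that options like -c are never mistaken for gantry's own.

```python
import shlex
from typing import List, Sequence


def build_gantry_run(workspace: str, cluster: str, command: Sequence[str]) -> List[str]:
    """Assemble argv for `gantry run`, keeping the user's command after `--`."""
    if not command:
        raise ValueError("command must not be empty")
    return [
        "gantry", "run",
        "--workspace", workspace,
        "--cluster", cluster,
        "--",  # everything after this is the command + arguments to run on Beaker
        *command,
    ]


argv = build_gantry_run(
    "my-workspace",  # placeholder: your Beaker workspace
    "my-cluster",    # placeholder: the Beaker cluster to run on
    ["python", "-c", 'print("Hello, World!")'],
)

# Print a copy-pasteable shell command with correct quoting.
print(shlex.join(argv))
```

Passing `argv` to `subprocess.run(argv, check=True)` from the root of your repository would then submit the experiment, assuming gantry is installed and configured.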

Best practices

Limit the scope and lifetime of your GitHub token

Your GitHub personal access token (PAT) only needs to have the repo scope and should have a short expiration time (e.g. 30 days). This limits the harm a bad actor could cause if they were able to read your PAT from your Beaker workspace somehow.

Use conda

Adding a conda environment file will generally make your exact Python environment easier to reproduce, especially when you have platform-dependent requirements like PyTorch. You don't necessarily need to write the environment.yml file manually either. If you've already initialized a conda environment locally, you can just run:

conda env export --from-history

See Exporting an Environment File Across Platforms for more details.

It's also okay to use a combination of conda environment and PIP requirements files.

FAQ

Can I use my own Docker/Beaker image?

You sure can! Just set the --beaker-image or --docker-image flag.

Gantry can use any image that has bash installed. This can be useful when you have dependencies that take a long time to download and build (like PyTorch).

In this case it works best if you build your image with a conda environment that already has your big dependencies installed. Then when you call gantry run, use the --venv option to tell Gantry to use that environment instead of creating a new conda environment in the container. You may also want to add a requirements.txt file to your repository that lists all of your dependencies (including PyTorch and anything else already installed in your image's conda environment) so Gantry can make sure the environment on the image is up-to-date when it runs.

For example, you could use one of our pre-built PyTorch images, such as ai2/pytorch1.11.0-cuda11.3-python3.9, like this:

gantry run \
    --beaker-image 'ai2/pytorch1.11.0-cuda11.3-python3.9' \
    --venv 'base' \
    --pip requirements.txt \
    -- python -c 'print("Hello, World!")'

Will Gantry work for GPU experiments?

Absolutely! This was the main use-case Gantry was developed for. Just set the --gpus option for gantry run to the number of GPUs you need. You should also ensure that the way in which you specify your Python environment (e.g. conda environment.yml, setup.py, or PIP requirements.txt file) will lead to your dependencies being properly installed to support the GPU hardware specific to the cluster you're running on.

For example, if one of your dependencies is PyTorch, you're probably best off writing a conda environment.yml file since conda is the preferred way to install PyTorch. You'll generally want to use the latest supported CUDA version, so in this case your environment.yml file could look like this:

name: torch-env
channels:
- pytorch
dependencies:
- python=3.9
- cudatoolkit=11.3
- numpy
- pytorch
- ...

Can I use both conda environment and PIP requirements files?

Yes, you can. Gantry will initialize your environment using your conda environment file (if you have one) and then will also check for a PIP requirements file.

How do I use a CUDA-enabled version of PyTorch on Beaker when I'm using a CPU-only version locally?

One way to handle this would be to start with a requirements.txt that lists the torch version you need along with any other dependencies, e.g.

# requirements.txt
torch==1.11.0
...

Then add a conda environment.yml somewhere in your repository that specifies exactly how to install PyTorch (and a CUDA toolkit) on Beaker, e.g.:

# beaker/environment.yml
name: torch-env
channels:
- pytorch
dependencies:
- python=3.9
- cudatoolkit=11.3
- pytorch==1.11.0  # make sure this matches the version in requirements.txt

When you call gantry run, use the --conda flag to specify the path to your conda env file (e.g. --conda beaker/environment.yml). Gantry will use that env file to initialize the environment, and then will install the rest of your dependencies from the requirements.txt file.
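
Because the two files pin PyTorch independently, they can drift apart over time. A small check like the following sketch can catch that before you submit; it is not part of Gantry, the helper name `pinned_version` is hypothetical, and the regular expression only understands simple `name==version` and `name=version` pins like the ones shown above.

```python
import re
from typing import Optional


def pinned_version(text: str, package: str) -> Optional[str]:
    """Return the version pinned for `package` in requirements or conda env text.

    Handles `torch==1.11.0`, `- pytorch==1.11.0  # comment`, and conda's
    single-equals form `- cudatoolkit=11.3`. Returns None if no pin is found.
    """
    pattern = re.compile(
        rf"^\s*(?:-\s*)?{re.escape(package)}\s*==?\s*([^\s#=]+)", re.MULTILINE
    )
    match = pattern.search(text)
    return match.group(1) if match else None


# Inline copies of the two example files from this answer.
requirements = """\
# requirements.txt
torch==1.11.0
"""

environment = """\
# beaker/environment.yml
name: torch-env
channels:
- pytorch
dependencies:
- python=3.9
- cudatoolkit=11.3
- pytorch==1.11.0  # make sure this matches the version in requirements.txt
"""

pip_version = pinned_version(requirements, "torch")
conda_version = pinned_version(environment, "pytorch")
if pip_version != conda_version:
    raise SystemExit(f"version mismatch: pip {pip_version} vs conda {conda_version}")
print(f"both files pin PyTorch {pip_version}")
```

In a real repository you would read the two files from disk (for example with `Path("requirements.txt").read_text()`) instead of using the inline strings.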

How can I save results or metrics from an experiment?

By default, Gantry uses the /results directory on the image as the location of the results dataset. That means that everything your experiment writes to this directory will be persisted as a Beaker dataset when the experiment finalizes. You can also create Beaker metrics for your experiment by writing a JSON file called metrics.json in the /results directory.
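
For instance, a training script could finish by dumping its metrics with a helper along these lines. This is a minimal sketch: `save_metrics` is a hypothetical name, the metric keys and values are made up for illustration, and the demo writes to a temporary directory so it can be run outside of Beaker.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Union


def save_metrics(metrics: Dict[str, Any], results_dir: Union[str, Path] = "/results") -> Path:
    """Write `metrics` to metrics.json inside `results_dir` and return the file path."""
    results_path = Path(results_dir)
    results_path.mkdir(parents=True, exist_ok=True)
    metrics_file = results_path / "metrics.json"
    with metrics_file.open("w", encoding="utf-8") as f:
        json.dump(metrics, f, indent=2, sort_keys=True)
    return metrics_file


# Local smoke test: use a temporary directory instead of /results.
with tempfile.TemporaryDirectory() as tmp:
    written = save_metrics({"accuracy": 0.91, "loss": 0.27}, results_dir=tmp)
    loaded = json.loads(written.read_text(encoding="utf-8"))

print(loaded)
```

Inside an experiment you would call `save_metrics(...)` with the default `results_dir` so the file lands in /results.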

Can I access data on NFS?

Yes. When you choose an on-premise cluster managed by the Beaker team that supports the NFS drive, the drive will be automatically attached to the experiment's container.

How can I just see the Beaker experiment spec that Gantry uses?

You can use the --dry-run option with gantry run to see what Gantry will submit without actually submitting an experiment. You can also use --save-spec PATH in combination with --dry-run to save the actual experiment spec to a YAML file.

How can I update Gantry's GitHub token?

Just use the command gantry config set-gh-token.

How can I attach Beaker datasets to an experiment?

Just use the --dataset option for gantry run. For example:

gantry run --dataset 'petew/squad-train:/input-data' -- ls /input-data
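
Judging by the `ls /input-data` example, the dataset shows up inside the experiment as an ordinary directory at the mount path you chose. A script could therefore enumerate its files as in the sketch below; the helper name `list_dataset_files` is hypothetical, and the demo uses a temporary directory with made-up file names as a stand-in for the mount.

```python
import tempfile
from pathlib import Path
from typing import List, Union


def list_dataset_files(mount_path: Union[str, Path]) -> List[str]:
    """Return the files under `mount_path` as sorted POSIX-style relative paths."""
    root = Path(mount_path)
    if not root.is_dir():
        raise FileNotFoundError(f"dataset mount not found: {root}")
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


# Local stand-in for the mount; on Beaker you would pass "/input-data".
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "train").mkdir()
    (Path(tmp) / "train" / "part-0.json").write_text("{}", encoding="utf-8")
    (Path(tmp) / "README.txt").write_text("demo", encoding="utf-8")
    files = list_dataset_files(tmp)

print(files)
```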

How can I run distributed batch jobs with Gantry?

The three options --replicas (int), --leader-selection (flag), and --host-networking (flag) used together give you the ability to run distributed batch jobs. See the Beaker docs for more information.
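
Inside a replicated job, each replica typically needs to know its own rank, the total replica count, and where the leader is. The sketch below reads those values from environment variables; the names `BEAKER_REPLICA_RANK`, `BEAKER_REPLICA_COUNT`, and `BEAKER_LEADER_REPLICA_HOSTNAME` are assumptions that you should verify against the Beaker docs, and `read_replica_info` is a hypothetical helper.

```python
import os
from typing import Mapping, NamedTuple, Optional


class ReplicaInfo(NamedTuple):
    rank: int
    count: int
    leader_hostname: Optional[str]


def read_replica_info(env: Optional[Mapping[str, str]] = None) -> ReplicaInfo:
    """Read replica rank, replica count, and leader hostname from the environment.

    The variable names are assumptions to check against the Beaker docs.
    Missing variables fall back to a single-replica setup.
    """
    env = os.environ if env is None else env
    return ReplicaInfo(
        rank=int(env.get("BEAKER_REPLICA_RANK", "0")),
        count=int(env.get("BEAKER_REPLICA_COUNT", "1")),
        leader_hostname=env.get("BEAKER_LEADER_REPLICA_HOSTNAME"),
    )


info = read_replica_info()
print(f"replica {info.rank} of {info.count}, leader: {info.leader_hostname}")
```

Falling back to rank 0 of 1 keeps the same script usable for ordinary single-replica runs.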

Why "Gantry"?

A gantry is a structure that's used, among other things, to lift containers off of ships. Analogously, Beaker Gantry's purpose is to lift Docker containers (or at least the management of Docker containers) away from users.
