
Basic framework for model training and evaluation in PyTorch.


boiler-pytorch

Basic framework for training stuff in PyTorch. It's quite tailored to projects I've been working on lately, so it's meant for personal use. Its sole purpose is to do away with boilerplate code, and having it here makes it easier to share across projects.

Install

pip install boilr

Usage example/template

There's a usage example that can be useful as a template. It's a basic VAE for MNIST, quickly hacked together. The example files are:

  • example.py
  • example_evaluate.py
  • experiments/mnist_experiment/data.py
  • experiments/mnist_experiment/experiment_manager.py
  • models/mnist_vae.py

Install requirements and run the example:

pip install -r requirements.txt
CUDA_VISIBLE_DEVICES=0 python example.py

For evaluation:

CUDA_VISIBLE_DEVICES=0 python example_evaluate.py --ll --ll-samples 100 --load $RUN_NAME

where $RUN_NAME is the name of the folder in output/ generated by running the example.

Quick reference

Built-in functionalities

The following functionalities are available out-of-the-box:

  • Easy logging of metrics to tensorboard and to a pickle file. Metrics are collected at every training step, smoothed, and logged/saved at a specified frequency. The amount of smoothing is also customizable.
  • Summaries of the metrics are automatically printed after each training and testing phase. This can be easily customized.
  • Training speed, gradient norm (global and per-parameter), and L2 norm of the model parameters are all automatically logged.
  • It's easy to save images from testing, in a dedicated folder.
  • Gradient clipping (by global norm), controllable through a command-line argument.
  • Automatic model checkpointing, with command-line argument to control the maximum number of recent checkpoints to be kept.
  • Command-line argument to resume training from a checkpoint; restoring the run is handled automatically.
  • Progress bar for training and testing, using tqdm. Can be switched off.
  • Data-dependent initialization (command-line argument).
  • Reproducibility: set random seed across all devices and Python libraries.
  • A suite of utility classes and methods in the packages boilr.nn and boilr.utils (most of them for internal use). In particular boilr.nn.modules and boilr.utils.viz might be more generally useful.
  • A long list of command-line arguments to control some of the behaviour above. Some arguments are not directly used, but it's convenient to have them already defined: e.g. if a custom DataLoader is necessary, the batch size is easily accessible with args.batch_size; and when creating the optimizer, the learning rate is args.lr.
  • See boilr.options for package-wide options. Usually it's not necessary to change them, but they give some more flexibility.
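To give an idea of what "collected at every training step, smoothed, and logged at a specified frequency" can look like, here is a minimal sketch in plain Python. It is not boilr's implementation: the class name SmoothedMetricLogger, its parameters, and the choice of a simple moving average are all assumptions made for illustration, and the library's actual smoothing may differ.

```python
from collections import deque


class SmoothedMetricLogger:
    """Hypothetical helper (not part of boilr): moving-average metric smoothing."""

    def __init__(self, window=100, log_every=1000):
        self.window = window        # number of recent steps to average over
        self.log_every = log_every  # emit a smoothed record every N steps
        self._buffers = {}          # metric name -> deque of recent raw values
        self.history = []           # list of (step, {name: smoothed value})

    def add(self, step, metrics):
        """Record raw metrics for one training step; log if it is time."""
        for name, value in metrics.items():
            buf = self._buffers.setdefault(name, deque(maxlen=self.window))
            buf.append(float(value))
        if step % self.log_every == 0:
            smoothed = {k: sum(b) / len(b) for k, b in self._buffers.items()}
            self.history.append((step, smoothed))
            return smoothed
        return None


# Toy run: 10 steps, average over the last 3 values, log every 5 steps.
logger = SmoothedMetricLogger(window=3, log_every=5)
for step in range(1, 11):
    logger.add(step, {"loss": 1.0 / step})
```

In this sketch, `window` plays the role of the "amount of smoothing" and `log_every` the role of the logging frequency; `logger.history` ends up with one smoothed record per logging step.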

Command-line arguments

There are built-in command-line arguments with default values. These defaults can be easily overridden programmatically when defining the experiment class that subclasses one of boilr's. The built-in arguments are the following:

  • batch-size: training batch size (default: None)
  • test-batch-size: test batch size (default: None)
  • lr: learning rate (default: None)
  • max-grad-norm: maximum global norm of the gradient. It is clipped if larger. If None, no clipping is performed. (default: None)
  • seed: random seed (default: 54321)
  • tr-log-every: log training metrics every this number of training steps (default: 1000)
  • ts-log-every: log test metrics every this number of training steps. It must be a multiple of --tr-log-every (default: 1000)
  • ts-img-every: save test images every this number of training steps. It must be a multiple of --ts-log-every (default: same as --ts-log-every)
  • checkpoint-every: save model checkpoint every this number of training steps (default: 1000)
  • keep-checkpoint-max: keep at most this number of most recent model checkpoints (default: 3)
  • max-steps: max number of training steps (default: 1e10)
  • max-epochs: max number of training epochs (default: 1e7)
  • nocuda: do not use cuda (default: False)
  • descr: additional description for experiment name
  • dry-run: do not save anything to disk (default: False)
  • resume: load the run with this name and resume training

Additionally, for VAEExperimentManager, the following arguments are available:

  • ll-every: evaluate log likelihood (with the importance-weighted bound) every this number of training steps (default: 50000)
  • ll-samples: number of importance-weighted samples to evaluate log likelihood (default: 100)
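For reference, the importance-weighted bound estimates log p(x) as the log of the average of K importance weights w_k = p(x, z_k) / q(z_k | x), with K corresponding to ll-samples. The sketch below shows that estimate for a single data point, computed stably with a log-mean-exp. The function name iw_log_likelihood and the toy numbers are made up for illustration; this is not the code used by VAEExperimentManager.

```python
import math


def iw_log_likelihood(log_weights):
    """Estimate log p(x) from K log importance weights.

    Each entry is log w_k = log p(x, z_k) - log q(z_k | x).
    Uses log-mean-exp with the max subtracted for numerical stability.
    """
    k = len(log_weights)
    m = max(log_weights)
    return m + math.log(sum(math.exp(lw - m) for lw in log_weights)) - math.log(k)


# Toy log-weights for one data point (made-up numbers, K = 4).
log_w = [-90.0, -92.5, -91.0, -95.0]
estimate = iw_log_likelihood(log_w)
```

The estimate always lies between the plain average of the log-weights (the usual single-sample ELBO estimate) and the largest log-weight, which is why more samples give a tighter bound in expectation.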

Getting started

  1. subclass a base dataset manager class;
  2. subclass a base model class;
  3. subclass a base experiment manager class (the model class is used in here);
  4. make a short script that creates the experiment object, uses it to create a boilr.Trainer, and runs the trainer;
  5. optionally, subclass the base evaluator to set up an "offline" evaluation pipeline.

See below for more details.

Dataset manager class (1)

The class boilr.data.BaseDatasetManager must be subclassed. The subclass must implement the method _make_datasets which should return a tuple (train, test) with the training and test sets as PyTorch Datasets. A basic implementation of _make_dataloaders is already provided, but can be overridden to make custom data loaders.
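The division of labour between _make_datasets (must be implemented) and _make_dataloaders (default provided, can be overridden) can be sketched as follows. To keep it runnable without PyTorch or boilr, this uses a stand-in base class and plain lists: StandInDatasetManager, its constructor, and the method signatures are assumptions for illustration only, and the real signatures in boilr.data.BaseDatasetManager may differ.

```python
class StandInDatasetManager:
    """Hypothetical stand-in for a base dataset manager (NOT boilr's class)."""

    def __init__(self, batch_size):
        # The base class calls the hooks; subclasses only fill them in.
        self.train, self.test = self._make_datasets()
        self.train_loader, self.test_loader = self._make_dataloaders(
            self.train, self.test, batch_size)

    def _make_datasets(self):
        # Subclasses must return a tuple (train, test).
        raise NotImplementedError

    def _make_dataloaders(self, train, test, batch_size):
        # Default "loaders": plain lists of consecutive batches.
        def batches(data):
            return [data[i:i + batch_size]
                    for i in range(0, len(data), batch_size)]
        return batches(train), batches(test)


class ToyDatasetManager(StandInDatasetManager):
    def _make_datasets(self):
        data = list(range(10))
        return data[:8], data[8:]  # (train, test)


manager = ToyDatasetManager(batch_size=3)
```

In a real subclass, the two returned objects would be PyTorch Datasets rather than lists; see experiments/mnist_experiment/data.py for the actual usage.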

Model class (2)

One of the model classes must be subclassed to inherit core methods from the base implementation boilr.models.BaseModel. These models also automatically subclass torch.nn.Module, so the subclass must implement forward. In addition, boilr.models.BaseGenerativeModel (which subclasses BaseModel) defines a method sample_prior that must be implemented by subclasses.

Experiment manager class (3)

One of the base experiment classes in boilr.experiments must be subclassed. The subclass must implement:

  • _make_datamanager to create the dataset manager, which should subclass boilr.data.BaseDatasetManager;
  • _make_model to create the model, which should subclass boilr.models.BaseModel;
  • _make_optimizer to create the optimizer, which should subclass torch.optim.optimizer.Optimizer;
  • forward_pass to perform a simple single-pass model evaluation and return losses and metrics;
  • test_procedure to evaluate the model on the test set (usually heavily based on the forward_pass method).

The following should typically be overridden:

  • _define_args_defaults, _add_args, and _check_args (or a subset of these) to manage parsing of command-line arguments;
  • _make_run_description which returns a string description of the run, used for output folders;
  • save_images to save output images (e.g. reconstructions and samples in VAEs).

The following may be overridden for additional control:

  • post_backward_callback is called by the Trainer after the backward pass but before the optimization step;
  • get_metrics_dict translates a dictionary of results to a dictionary of metrics to be logged (by default this simply copies over the keys);
  • train_log_str and test_log_str return log strings for training and test metrics, respectively.

Note: The class VAEExperimentManager implements default test_procedure and save_images methods for variational inference with VAEs.

Example training script (4)

from boilr import Trainer
from my_experiment import MyExperimentClass

if __name__ == "__main__":
    experiment = MyExperimentClass()
    trainer = Trainer(experiment)
    trainer.run()

Offline evaluator class (5)

If offline evaluation is necessary, boilr.eval.BaseOfflineEvaluator can be subclassed by implementing:

  • run to run the evaluation;
  • as above, _define_args_defaults, _add_args, and _check_args (or a subset of these) to manage parsing of command-line arguments.

The method run can be executed by simply calling the evaluator object. See example_evaluate.py.
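"Calling the evaluator object" just means using it like a function. A bare-bones sketch of that pattern is below, with a stand-in base class so it runs on its own: StandInEvaluator is hypothetical and is not boilr.eval.BaseOfflineEvaluator, whose __call__ may well do more setup before delegating to run.

```python
class StandInEvaluator:
    """Hypothetical stand-in (NOT boilr.eval.BaseOfflineEvaluator)."""

    def __call__(self):
        # Calling the object simply delegates to run().
        return self.run()

    def run(self):
        # Subclasses implement the actual evaluation here.
        raise NotImplementedError


class MyEvaluator(StandInEvaluator):
    def run(self):
        # Placeholder result; a real evaluator would load a model and evaluate it.
        return {"status": "evaluated"}


evaluator = MyEvaluator()
result = evaluator()  # same as evaluator.run()
```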

Notes

  • It also works without tensorboard, but it won't save tensorboard logs.