Lightning-Hydra

🚡 Let's learn how to use lightning-hydra-template!

Introduction

  • PyTorch Lightning: a lightweight PyTorch wrapper for high-performance AI research. Think of it as a framework for organizing your PyTorch code.

  • Hydra: a framework for elegantly configuring complex applications. The key feature is the ability to dynamically create a hierarchical configuration by composition and override it through config files and the command line.
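The template glues the two together: Hydra composes a config tree from `configs/`, and the entry script instantiates Lightning objects from it. Below is a minimal sketch of that pattern (not the template's actual `src/train.py`; config names follow the structure in the next section):

```python
# Minimal sketch of the Hydra + Lightning pattern; not the template's real entry point.
import hydra
from omegaconf import DictConfig


@hydra.main(version_base="1.3", config_path="configs", config_name="train.yaml")
def main(cfg: DictConfig) -> None:
    # Every config node with a `_target_` key is built into a real object.
    datamodule = hydra.utils.instantiate(cfg.datamodule)
    model = hydra.utils.instantiate(cfg.model)
    trainer = hydra.utils.instantiate(cfg.trainer)
    trainer.fit(model=model, datamodule=datamodule)


if __name__ == "__main__":
    main()
```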

Project Structure

├── configs                   <- Hydra configuration files
│   ├── callbacks                <- Callbacks configs
│   ├── datamodule               <- Datamodule configs
│   ├── debug                    <- Debugging configs
│   ├── experiment               <- Experiment configs
│   ├── extras                   <- Extra utilities configs
│   ├── hparams_search           <- Hyperparameter search configs
│   ├── hydra                    <- Hydra configs
│   ├── local                    <- Local configs
│   ├── logger                   <- Logger configs
│   ├── model                    <- Model configs
│   ├── paths                    <- Project paths configs
│   ├── trainer                  <- Trainer configs
│   │
│   ├── eval.yaml             <- Main config for evaluation
│   └── train.yaml            <- Main config for training
│
├── data                   <- Project data
│
├── logs                   <- Logs generated by hydra and lightning loggers
│
├── notebooks              <- Jupyter notebooks. Naming convention is a number (for ordering),
│                             the creator's initials, and a short `-` delimited description,
│                             e.g. `1.0-jqp-initial-data-exploration.ipynb`.
│
├── scripts                <- Shell scripts
│
├── src                    <- Source code
│   ├── datamodules              <- Lightning datamodules
│   ├── models                   <- Lightning models
│   ├── utils                    <- Utility scripts
│   │
│   ├── eval.py                  <- Run evaluation
│   └── train.py                 <- Run training
│
├── tests                  <- Tests of any kind
│
├── .env.example              <- Example file for storing private environment variables
├── .gitignore                <- List of files ignored by git
├── .pre-commit-config.yaml   <- Configuration of pre-commit hooks for code formatting
├── Makefile                  <- Makefile with commands like `make train` or `make test`
├── pyproject.toml            <- Configuration options for testing and linting
├── requirements.txt          <- File for installing python dependencies
├── setup.py                  <- File for installing project as a package
└── README.md
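The two entry-point configs at the bottom of `configs/` tie the groups together through Hydra's defaults list. A hedged sketch of what `configs/train.yaml` might contain (the chosen group members are illustrative assumptions, not the template's exact file):

```yaml
# Illustrative sketch of configs/train.yaml; group choices are assumptions.
defaults:
  - datamodule: mnist.yaml
  - model: mnist.yaml
  - callbacks: default.yaml
  - logger: null          # pick one from configs/logger, e.g. `logger=csv`
  - trainer: default.yaml
  - paths: default.yaml
  - extras: default.yaml
  - hydra: default.yaml
  - experiment: null      # enabled via `experiment=...` on the command line
  - hparams_search: null  # enabled via `hparams_search=...`

task_name: "train"
seed: null
ckpt_path: null
```

Each entry selects one file from the matching config group above, and anything selected this way can be swapped out from the command line (next section).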

Command Line

  • Override config parameters: `python train.py trainer.max_epochs=20 model.optimizer.lr=1e-4`

  • Add a new parameter (the `+` prefix appends a key that is not in the config): `python train.py +model.new_param="owo"`

  • Select the training device: `python train.py trainer=gpu`

  • Train with mixed precision: `python train.py trainer=gpu +trainer.precision=16`

  • Train with the config from `configs/experiment/example.yaml` (see the sketch after this list): `python train.py experiment=example`

  • Resume from a checkpoint: `python train.py ckpt_path="/path/to/ckpt/name.ckpt"`

  • Evaluate a checkpoint on the test dataset: `python eval.py ckpt_path="/path/to/ckpt/name.ckpt"`

  • Create a sweep over hyperparameters: `python train.py -m datamodule.batch_size=32,64,128 model.lr=0.001,0.0005`

    🥑 results in 6 different combinations (3 batch sizes × 2 learning rates)

  • ❤️ HPO with Optuna: `python train.py -m hparams_search=mnist_optuna experiment=example`. Everything can be defined in a single config file; see the Hyperparameter Search section below.

  • Execute all experiments from the folder `configs/experiment/`: `python train.py -m 'experiment=glob(*)'`

  • Execute with multiple seeds: `python train.py -m seed=1,2,3,4,5 trainer.deterministic=True logger=csv tags=["benchmark"]`
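The `experiment=example` override above selects a single file that pins everything needed to reproduce a run. A hedged sketch in the spirit of `configs/experiment/example.yaml` (values are illustrative assumptions, not the template's exact file):

```yaml
# @package _global_
# Illustrative sketch of an experiment config; values are assumptions.
defaults:
  - override /datamodule: mnist.yaml
  - override /model: mnist.yaml
  - override /trainer: default.yaml

tags: ["mnist", "example"]
seed: 12345

trainer:
  max_epochs: 10

model:
  optimizer:
    lr: 0.002
```

The `# @package _global_` header makes the overrides apply at the root of the config tree, so `trainer.max_epochs` here behaves exactly like `trainer.max_epochs=10` on the command line.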

Workflow

Steps

  1. Write a PyTorch Lightning module, e.g. src/models/mnist_module.py (a minimal sketch follows this list).

  2. Write a PyTorch Lightning datamodule, e.g. src/datamodules/mnist_datamodule.py.

  3. Write an experiment config (see the sketch after the Command Line section above).

  4. Run training from the command line: `python src/train.py experiment=experiment_name`
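Step 1 looks roughly like the following. This is a hedged, minimal sketch in the spirit of src/models/mnist_module.py, not the template's actual module; layer sizes and metric names are illustrative assumptions:

```python
# Hypothetical minimal LightningModule; sizes and names are illustrative.
import torch
from pytorch_lightning import LightningModule


class MNISTLitModule(LightningModule):
    def __init__(self, lr: float = 0.001):
        super().__init__()
        self.save_hyperparameters()  # exposes `lr` as self.hparams.lr
        self.net = torch.nn.Sequential(
            torch.nn.Flatten(),
            torch.nn.Linear(28 * 28, 128),
            torch.nn.ReLU(),
            torch.nn.Linear(128, 10),
        )
        self.criterion = torch.nn.CrossEntropyLoss()

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = self.criterion(self.net(x), y)
        self.log("train/loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.hparams.lr)
```

Because `lr` is captured by `save_hyperparameters()`, it shows up in the loggers and can be overridden through the model config.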

Experiment design

  1. Accuracy vs. batch size: `python train.py -m logger=csv datamodule.batch_size=16,32,64,128 tags=["batch_size_exp"]`

  2. Logs: logger configs live in configs/logger; run `python train.py logger=logger_name`

  3. Tests: run the whole suite with `pytest`, or a single file with `pytest tests/test_train.py` (a hedged sketch of such a test follows this list)
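A test can be as small as a smoke test that composes the training config and runs a single `fast_dev_run` batch. A hedged sketch in the spirit of tests/test_train.py (it assumes src/train.py exposes a `train(cfg)` function, as lightning-hydra-template does):

```python
# Hypothetical smoke test; assumes src/train.py exposes train(cfg).
from hydra import compose, initialize

from src.train import train


def test_train_fast_dev_run():
    # Compose the same config that `python src/train.py` would build;
    # config_path is relative to this test file.
    with initialize(version_base="1.3", config_path="../configs"):
        # `++` adds the key if it is missing, overrides it if present.
        cfg = compose(config_name="train.yaml", overrides=["++trainer.fast_dev_run=True"])
    train(cfg)
```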

Hyperparameter Search

  1. Config files live in configs/hparams_search (a hedged sketch follows this list).

  2. Command line: `python train.py -m hparams_search=mnist_optuna`

  3. Supported frameworks: Optuna, Ax, and Nevergrad.

  4. The optimization_results.yaml file will be available under the logs/task_name/multirun folder.
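The point of an hparams_search config is that the metric, the sweeper, and the search space all live in one file. A hedged sketch in the spirit of configs/hparams_search/mnist_optuna.yaml (requires the hydra-optuna-sweeper plugin; values are illustrative assumptions):

```yaml
# @package _global_
# Illustrative sketch, not the template's exact mnist_optuna.yaml.
defaults:
  - override /hydra/sweeper: optuna

# metric to optimize; it must be logged by the LightningModule
optimized_metric: "val/acc_best"

hydra:
  mode: "MULTIRUN"
  sweeper:
    direction: maximize
    n_trials: 20
    params:
      model.optimizer.lr: interval(0.0001, 0.1)
      datamodule.batch_size: choice(32, 64, 128)
```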

Q&A