Lightning-Hydra

🚡 Let's learn how to use lightning-hydra-template!

Introduction

  • PyTorch Lightning: a lightweight PyTorch wrapper for high-performance AI research. Think of it as a framework for organizing your PyTorch code.

  • Hydra: a framework for elegantly configuring complex applications. The key feature is the ability to dynamically create a hierarchical configuration by composition and override it through config files and the command line.
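The template glues the two together: Hydra composes a config tree from `configs/`, and the entry script instantiates Lightning objects from it. Below is a minimal sketch of that pattern (not the template's actual `src/train.py`; config names follow the structure in the next section):

```python
# Minimal sketch of the Hydra + Lightning pattern; not the template's real entry point.
import hydra
from omegaconf import DictConfig


@hydra.main(version_base="1.3", config_path="configs", config_name="train.yaml")
def main(cfg: DictConfig) -> None:
    # Every config node with a `_target_` key is built into a real object.
    datamodule = hydra.utils.instantiate(cfg.datamodule)
    model = hydra.utils.instantiate(cfg.model)
    trainer = hydra.utils.instantiate(cfg.trainer)
    trainer.fit(model=model, datamodule=datamodule)


if __name__ == "__main__":
    main()
```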

Project Structure

├── configs                   <- Hydra configuration files
│   ├── callbacks                <- Callbacks configs
│   ├── datamodule               <- Datamodule configs
│   ├── debug                    <- Debugging configs
│   ├── experiment               <- Experiment configs
│   ├── extras                   <- Extra utilities configs
│   ├── hparams_search           <- Hyperparameter search configs
│   ├── hydra                    <- Hydra configs
│   ├── local                    <- Local configs
│   ├── logger                   <- Logger configs
│   ├── model                    <- Model configs
│   ├── paths                    <- Project paths configs
│   ├── trainer                  <- Trainer configs
│   │
│   ├── eval.yaml             <- Main config for evaluation
│   └── train.yaml            <- Main config for training
│
├── data                   <- Project data
│
├── logs                   <- Logs generated by hydra and lightning loggers
│
├── notebooks              <- Jupyter notebooks. Naming convention is a number (for ordering),
│                             the creator's initials, and a short `-` delimited description,
│                             e.g. `1.0-jqp-initial-data-exploration.ipynb`.
│
├── scripts                <- Shell scripts
│
├── src                    <- Source code
│   ├── datamodules              <- Lightning datamodules
│   ├── models                   <- Lightning models
│   ├── utils                    <- Utility scripts
│   │
│   ├── eval.py                  <- Run evaluation
│   └── train.py                 <- Run training
│
├── tests                  <- Tests of any kind
│
├── .env.example              <- Example file for storing private environment variables
├── .gitignore                <- List of files ignored by git
├── .pre-commit-config.yaml   <- Configuration of pre-commit hooks for code formatting
├── Makefile                  <- Makefile with commands like `make train` or `make test`
├── pyproject.toml            <- Configuration options for testing and linting
├── requirements.txt          <- File for installing python dependencies
├── setup.py                  <- File for installing project as a package
└── README.md
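The two entry-point configs at the bottom of `configs/` tie the groups together through Hydra's defaults list. A hedged sketch of what `configs/train.yaml` might contain (the chosen group members are illustrative assumptions, not the template's exact file):

```yaml
# Illustrative sketch of configs/train.yaml; group choices are assumptions.
defaults:
  - datamodule: mnist.yaml
  - model: mnist.yaml
  - callbacks: default.yaml
  - logger: null          # pick one from configs/logger, e.g. `logger=csv`
  - trainer: default.yaml
  - paths: default.yaml
  - extras: default.yaml
  - hydra: default.yaml
  - experiment: null      # enabled via `experiment=...` on the command line
  - hparams_search: null  # enabled via `hparams_search=...`

task_name: "train"
seed: null
ckpt_path: null
```

Each entry selects one file from the matching config group above, and anything selected this way can be swapped out from the command line (next section).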

Command Line

  • Override config parameters: `python train.py trainer.max_epochs=20 model.optimizer.lr=1e-4`

  • Add a new parameter (the `+` prefix appends a key that is not in the config): `python train.py +model.new_param="owo"`

  • Select the training device: `python train.py trainer=gpu`

  • Train with mixed precision: `python train.py trainer=gpu +trainer.precision=16`

  • Train with the config from `configs/experiment/example.yaml` (see the sketch after this list): `python train.py experiment=example`

  • Resume from a checkpoint: `python train.py ckpt_path="/path/to/ckpt/name.ckpt"`

  • Evaluate a checkpoint on the test dataset: `python eval.py ckpt_path="/path/to/ckpt/name.ckpt"`

  • Create a sweep over hyperparameters: `python train.py -m datamodule.batch_size=32,64,128 model.lr=0.001,0.0005`

    🥑 results in 6 different combinations (3 batch sizes × 2 learning rates)

  • ❤️ HPO with Optuna: `python train.py -m hparams_search=mnist_optuna experiment=example`. Everything can be defined in a single config file; see the Hyperparameter Search section below.

  • Execute all experiments from the folder `configs/experiment/`: `python train.py -m 'experiment=glob(*)'`

  • Execute with multiple seeds: `python train.py -m seed=1,2,3,4,5 trainer.deterministic=True logger=csv tags=["benchmark"]`
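The `experiment=example` override above selects a single file that pins everything needed to reproduce a run. A hedged sketch in the spirit of `configs/experiment/example.yaml` (values are illustrative assumptions, not the template's exact file):

```yaml
# @package _global_
# Illustrative sketch of an experiment config; values are assumptions.
defaults:
  - override /datamodule: mnist.yaml
  - override /model: mnist.yaml
  - override /trainer: default.yaml

tags: ["mnist", "example"]
seed: 12345

trainer:
  max_epochs: 10

model:
  optimizer:
    lr: 0.002
```

The `# @package _global_` header makes the overrides apply at the root of the config tree, so `trainer.max_epochs` here behaves exactly like `trainer.max_epochs=10` on the command line.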

Workflow

Steps

  1. Write a PyTorch Lightning module, e.g. src/models/mnist_module.py (a minimal sketch follows this list).

  2. Write a PyTorch Lightning datamodule, e.g. src/datamodules/mnist_datamodule.py.

  3. Write an experiment config (see the sketch after the Command Line section above).

  4. Run training from the command line: `python src/train.py experiment=experiment_name`
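Step 1 looks roughly like the following. This is a hedged, minimal sketch in the spirit of src/models/mnist_module.py, not the template's actual module; layer sizes and metric names are illustrative assumptions:

```python
# Hypothetical minimal LightningModule; sizes and names are illustrative.
import torch
from pytorch_lightning import LightningModule


class MNISTLitModule(LightningModule):
    def __init__(self, lr: float = 0.001):
        super().__init__()
        self.save_hyperparameters()  # exposes `lr` as self.hparams.lr
        self.net = torch.nn.Sequential(
            torch.nn.Flatten(),
            torch.nn.Linear(28 * 28, 128),
            torch.nn.ReLU(),
            torch.nn.Linear(128, 10),
        )
        self.criterion = torch.nn.CrossEntropyLoss()

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = self.criterion(self.net(x), y)
        self.log("train/loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.hparams.lr)
```

Because `lr` is captured by `save_hyperparameters()`, it shows up in the loggers and can be overridden through the model config.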

Experiment design

  1. Accuracy vs. batch size: `python train.py -m logger=csv datamodule.batch_size=16,32,64,128 tags=["batch_size_exp"]`

  2. Logs: logger configs live in configs/logger; run `python train.py logger=logger_name`

  3. Tests: run the whole suite with `pytest`, or a single file with `pytest tests/test_train.py` (a hedged sketch of such a test follows this list)
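A test can be as small as a smoke test that composes the training config and runs a single `fast_dev_run` batch. A hedged sketch in the spirit of tests/test_train.py (it assumes src/train.py exposes a `train(cfg)` function, as lightning-hydra-template does):

```python
# Hypothetical smoke test; assumes src/train.py exposes train(cfg).
from hydra import compose, initialize

from src.train import train


def test_train_fast_dev_run():
    # Compose the same config that `python src/train.py` would build;
    # config_path is relative to this test file.
    with initialize(version_base="1.3", config_path="../configs"):
        # `++` adds the key if it is missing, overrides it if present.
        cfg = compose(config_name="train.yaml", overrides=["++trainer.fast_dev_run=True"])
    train(cfg)
```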

Hyperparameter Search

  1. Config files live in configs/hparams_search (a hedged sketch follows this list).

  2. Command line: `python train.py -m hparams_search=mnist_optuna`

  3. Supported frameworks: Optuna, Ax, and Nevergrad.

  4. The optimization_results.yaml file will be available under the logs/task_name/multirun folder.
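The point of an hparams_search config is that the metric, the sweeper, and the search space all live in one file. A hedged sketch in the spirit of configs/hparams_search/mnist_optuna.yaml (requires the hydra-optuna-sweeper plugin; values are illustrative assumptions):

```yaml
# @package _global_
# Illustrative sketch, not the template's exact mnist_optuna.yaml.
defaults:
  - override /hydra/sweeper: optuna

# metric to optimize; it must be logged by the LightningModule
optimized_metric: "val/acc_best"

hydra:
  mode: "MULTIRUN"
  sweeper:
    direction: maximize
    n_trials: 20
    params:
      model.optimizer.lr: interval(0.0001, 0.1)
      datamodule.batch_size: choice(32, 64, 128)
```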

Q&A