Click on "Use this template" to start your own project, or go to the Documentation for more information.
A template for machine learning or deep learning projects.
- Easy to implement your own model and dataloader through hydra instantiation of datamodules and models
- Configurable hyperparameters with Hydra
- Logging with the solution that fits your needs
- Works on CPU, multi-GPU, and multi-TPU
- Uses the bleeding-edge UV package manager
- Pre-commit hooks to validate code style and quality
- torch.compile of models
- Tensor typing validation with TorchTyping
- Dockerized project (Dockerfile, run tests and training through docker, optionally docker-compose)
- Examples of efficient multi-processing using Python's Pool.map
- Examples using polars for faster and more efficient dataframe processing
- Example of mock tests using pytest
- Utility scripts to download datasets from Kaggle
- Cloud data retrieval using cloudpathlib (launch your training on AWS, GCP, Azure)
- Architecture and example of a model serving API built with LitServe
- Wiki creation and setup of a documentation website with useful integrations through Mkdocs
- Use this repository as a template
- Clone your repository
- Run `make install` to install the dependencies
- Add your model, which inherits from `LightningModule`, in `src/models`
- Add your dataset, which inherits from `DataModule`, in `src/data`
- Add the associated yaml configuration files in the `configs/` folder, following the existing examples
- Read the Makefile to learn about all the available commands
The `train.py` and `eval.py` scripts are the entry points of the project. They use Hydra to instantiate the model (LightningModule), the dataloader (DataModule), and the trainer from the yaml configuration; the model is then trained or evaluated with PyTorch Lightning.
`serve.py` is used to serve the model through a REST API using LitServe, based on the `configs/serve.yaml` configuration file.
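To make this concrete, here is a minimal, hypothetical sketch of the LitServe pattern; the real implementation lives in `serve.py` and is driven by `configs/serve.yaml`, so the dummy model and payload shape below are assumptions:

```python
# Hypothetical sketch of a LitServe API; not the template's actual serve.py.
import torch
import litserve as ls


class InferenceAPI(ls.LitAPI):
    def setup(self, device):
        # Called once per worker: load the model (a dummy linear layer here).
        self.model = torch.nn.Linear(4, 2).to(device).eval()
        self.device = device

    def decode_request(self, request):
        # Turn the incoming JSON payload into a model input tensor.
        return torch.tensor(request["input"], device=self.device)

    def predict(self, x):
        with torch.no_grad():
            return self.model(x)

    def encode_response(self, output):
        # Turn the model output back into a JSON-serializable payload.
        return {"output": output.tolist()}


if __name__ == "__main__":
    server = ls.LitServer(InferenceAPI(), accelerator="auto")
    server.run(port=8000)
```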
You don't need to worry about implementing the training loop, supporting different hardware, parsing configurations, etc. For each training run you only need to care about 4 files: your LightningModule (+ its Hydra config) and your DataModule (+ its Hydra config).
In the LightningModule, you need to implement the following methods:

- `forward`
- `training_step`
- `validation_step`
- `test_step`

Get inspired by the provided examples in the `src/models` folder, or by the sketch below.
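As a minimal, hypothetical sketch (the class, its hyperparameters, and the injected `net` backbone are illustrative, not the template's actual code; note that `configure_optimizers` is usually needed as well):

```python
import torch
from torch import nn
import lightning.pytorch as pl  # or `import pytorch_lightning as pl`, depending on your install


class SimpleClassifier(pl.LightningModule):
    """A hypothetical classifier; the `net` backbone is meant to be injected by Hydra."""

    def __init__(self, net: nn.Module, lr: float = 1e-3):
        super().__init__()
        self.net = net
        self.lr = lr
        self.criterion = nn.CrossEntropyLoss()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = self.criterion(self(x), y)
        self.log("train/loss", loss)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        self.log("val/loss", self.criterion(self(x), y))

    def test_step(self, batch, batch_idx):
        x, y = batch
        self.log("test/loss", self.criterion(self(x), y))

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)
```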
For the DataModule, you need to implement the following methods:

- `prepare_data`
- `setup`
- `train_dataloader`
- `val_dataloader`
- `test_dataloader`

Get inspired by the provided examples in the `src/data` folder, or by the sketch below.
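Here is a minimal, hypothetical sketch, loosely modeled on the MNIST example in the repository; the torchvision dependency, split sizes, and defaults are assumptions:

```python
import lightning.pytorch as pl
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms  # assumed dependency for this example


class MNISTDataModule(pl.LightningDataModule):
    def __init__(self, data_dir: str = "data/", batch_size: int = 64):
        super().__init__()
        self.data_dir = data_dir
        self.batch_size = batch_size
        self.transform = transforms.ToTensor()

    def prepare_data(self):
        # Download once, on a single process.
        datasets.MNIST(self.data_dir, train=True, download=True)
        datasets.MNIST(self.data_dir, train=False, download=True)

    def setup(self, stage=None):
        # Assign and split datasets, on every process.
        full = datasets.MNIST(self.data_dir, train=True, transform=self.transform)
        self.train_set, self.val_set = random_split(full, [55000, 5000])
        self.test_set = datasets.MNIST(self.data_dir, train=False, transform=self.transform)

    def train_dataloader(self):
        return DataLoader(self.train_set, batch_size=self.batch_size, shuffle=True)

    def val_dataloader(self):
        return DataLoader(self.val_set, batch_size=self.batch_size)

    def test_dataloader(self):
        return DataLoader(self.test_set, batch_size=self.batch_size)
```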
Get to know more about PyTorch Lightning's LightningModule and DataModule in the PyTorch Lightning documentation. Finally, in the associated `configs/` folder, you need to write the yaml configuration files for your model and dataloader.
As Hydra is used for configuration, you can easily change the hyperparameters of your model, the dataloader, the trainer, etc. by editing the yaml configuration files in the `configs/` folder. You can also use the `--multirun` option to run multiple experiments with different configurations.
And since Hydra also instantiates the model and dataloader, you can just as easily swap the model, the dataloader, or any other component by editing the yaml configuration files, or directly on the command line. This is especially useful when you want to compare different models or dataloaders.
For example, you can run the following command to train a model with a different architecture, a different dataset, and a different trainer:
uv run src/train.py model=LeNet datamodule=MNISTDataModule trainer=gpu
Read more about Hydra in the official documentation.
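If you are curious about the mechanics, here is a simplified, hypothetical sketch of what an entry point like `train.py` does with Hydra; the `model`, `data`, and `trainer` keys mirror the `configs/` groups shown later, but the exact code is an assumption:

```python
# Simplified sketch of a Hydra entry point; see src/train.py for the real one.
import hydra
from omegaconf import DictConfig


@hydra.main(version_base="1.3", config_path="../configs", config_name="train.yaml")
def main(cfg: DictConfig) -> None:
    # Hydra builds each object from the `_target_` key in its config group.
    model = hydra.utils.instantiate(cfg.model)      # a LightningModule
    datamodule = hydra.utils.instantiate(cfg.data)  # a DataModule
    trainer = hydra.utils.instantiate(cfg.trainer)  # a Lightning Trainer
    trainer.fit(model=model, datamodule=datamodule)


if __name__ == "__main__":
    main()
```

Command-line overrides like `model=LeNet` simply select or modify the yaml that each of these `instantiate` calls consumes.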
- Type your functions and classes with TorchTyping for better type checking (in addition to Python's typing module)
- Docstring your functions and classes; it is even more important here, as docstrings are used to generate the documentation with Mkdocs
- Use the `make` commands to run your code, it is easier and faster than writing the full commands (and check the Makefile for all the available ones)
- Use the pre-commit hooks to ensure your code is formatted correctly and is of good quality
- UV is powerful (multi-threading, dependency graph solving, Rust backend, etc.); use it as much as you can
- If you have a lot of data, use Polars for faster and more efficient dataframe processing (see the first sketch after this list)
- If you have CPU-intensive tasks, use multi-processing with Python's Pool.map; you can find an example in the `src/utils/utils.py` file, and a minimal sketch after this list
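For Polars, a hypothetical lazy pipeline (the file name and columns are made up) could look like:

```python
import polars as pl

# scan_csv is lazy: the filter and aggregation are optimized together
# and only materialized by .collect().
df = (
    pl.scan_csv("data/events.csv")
    .filter(pl.col("duration") > 0)
    .group_by("user_id")
    .agg(pl.col("duration").mean().alias("mean_duration"))
    .collect()
)
```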
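And the Pool.map pattern, in a minimal self-contained form (this sketch is illustrative; the project's own example lives in `src/utils/utils.py`):

```python
from multiprocessing import Pool


def process_item(x: int) -> int:
    # Placeholder for a CPU-heavy computation.
    return x * x


if __name__ == "__main__":
    # The __main__ guard is required for multiprocessing on some platforms.
    with Pool(processes=4) as pool:
        results = pool.map(process_item, range(100))
    print(results[:5])  # [0, 1, 4, 9, 16]
```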
You have the possibility to generate a documentation website using Mkdocs. It will automatically generate the documentation based on both the markdown files in the `docs/` folder and the docstrings in your code.
To generate and serve the documentation locally:
make serve-docs # Documentation will be available at http://localhost:8000
And to deploy it to Github Pages (you need to enable Pages in your repository settings and set it to use the gh-pages branch):
make pages-deploy # It will create a gh-pages branch and push the documentation to it
This repository uses Github templates to help you with issues, pull requests, and discussions. It is a great way to standardize the way your team interacts with the repository. You can customize the templates to fit your needs; they can be found in the `.github` folder.
This template is perfect for your juniors' onboarding process. It has all the best practices and tools to make them productive from day one, and it is a great way to ensure that your team follows the same practices. For example, you can pick a training notebook for any dataset on Kaggle and ask your junior to industrialize it into a full-fledged project; this will help them understand the best practices and tools used in the industry. After selecting the dataset and notebook, potential steps for the junior can be:
- Implement the DataModule and the LightningModule
- Implement the associated yaml configuration files and use Hydra to instantiate important classes
- Implement the training script
- Implement the evaluation script
- Implement unit tests
- Create a CI/CD pipeline with Github Actions
- Dockerize the project
- Create a Makefile with useful commands
- Implement the documentation with Mkdocs

All of this while following the best practices and tools provided in the template, as well as PEP 8. If the junior struggles at any point, they can refer to the provided examples in the project.
```
.
├── commit-template.txt       # use this file to set your commit message template, with make configure-commit-template
├── configs                   # configuration files for hydra
│   ├── callbacks             # configuration files for callbacks
│   ├── data                  # configuration files for datamodules
│   ├── debug                 # configuration files for pytorch lightning debuggers
│   ├── eval.yaml             # configuration file for evaluation
│   ├── experiment            # configuration files for experiments
│   ├── extras                # configuration files for extra components
│   ├── hparams_search        # configuration files for hyperparameters search
│   ├── local                 # configuration files for local training
│   ├── logger                # configuration files for loggers (neptune, wandb, etc.)
│   ├── model                 # configuration files for models (LightningModule)
│   ├── paths                 # configuration files for paths
│   ├── trainer               # configuration files for trainers (cpu, gpu, tpu)
│   └── train.yaml            # configuration file for training
├── data                      # data folder (to store potentially downloaded datasets)
├── Makefile                  # makefile with useful commands for the project
├── notebooks                 # notebooks folder
├── pyproject.toml            # pyproject.toml file for the uv package manager
├── README.md                 # this file
├── ruff.toml                 # ruff configuration file, used by pre-commit
├── scripts                   # scripts folder
│   └── example_train.sh
├── src                       # source code folder
│   ├── data                  # datamodules folder
│   │   ├── components
│   │   └── mnist_datamodule.py
│   ├── eval.py               # evaluation entry script
│   ├── models                # models folder (LightningModule)
│   │   └── components        # components folder, contains model parts or "nets"
│   ├── train.py              # training entry script
│   └── utils                 # utils folder
│       ├── instantiators.py  # instantiators for models and dataloaders
│       ├── logging_utils.py  # logger utils
│       ├── pylogger.py       # multi-process and multi-gpu safe logging
│       ├── rich_utils.py     # rich utils
│       └── utils.py          # general utils like multi-processing, etc.
└── tests                     # tests folder
    ├── conftest.py           # fixtures for tests
    └── mock_test.py          # example of mocking tests
```
For more information on how to contribute to this project, please refer to the CONTRIBUTING.md file.
This template was heavily inspired by great existing templates, with a few opinionated changes and improvements. Go check them out!