Note: this is a very experimental project. If you have any information or tips that could improve it, feel free to reach out and open an issue.
When creating machine learning projects, we usually run into a few very challenging problems:
- Data and Model Versioning
- Managing Experiments
- Model Deployment
- Validation and Monitoring
This project is my attempt to solve at least the first three problems listed above.
This is done with DVC (Data Version Control).
The steps I used to create this DVC pipeline can be found in `create_step.sh`.
DVC also takes care of the experiments, which, at least in my book, is a huge win (one less system to learn and manage).
The way I usually approach this is to create several branches, each modifying a few lines and parameters, and run `dvc repro` on each. I then compare the results of each branch against the others to decide which one gets promoted to `main`.
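
To make the branch comparison concrete, here is a minimal Python sketch that diffs two flat metrics dictionaries, such as the contents of `metrics.json` on `main` and on an experiment branch. The metric names and values are hypothetical, and the helper names `load_metrics` and `diff_metrics` are mine, not part of this project or of DVC. DVC ships its own `dvc metrics diff` command for this purpose; the sketch only illustrates the idea.

```python
import json
from pathlib import Path


def load_metrics(path):
    """Read a flat JSON metrics file (e.g. metrics.json) into a dict."""
    return json.loads(Path(path).read_text())


def diff_metrics(baseline, candidate):
    """Return {metric: candidate - baseline} for metrics present in both dicts."""
    return {
        name: candidate[name] - baseline[name]
        for name in baseline
        if name in candidate
    }


if __name__ == "__main__":
    # Hypothetical values standing in for metrics.json on two branches.
    main_metrics = {"accuracy": 0.90, "loss": 0.35}
    branch_metrics = {"accuracy": 0.92, "loss": 0.30}
    for name, delta in diff_metrics(main_metrics, branch_metrics).items():
        print(f"{name}: {delta:+.4f}")
```

A positive delta on a higher-is-better metric (or a negative one on a loss) suggests the branch is a candidate for promotion.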
The model is evaluated against the metrics saved in `metrics.json`. If an improvement is detected, the model should be saved into serving, which acts as the trigger for deployment.
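
The promotion step could look roughly like the sketch below. It is not the project's actual code: it assumes that `metrics.json` is a flat JSON object, that a single higher-is-better metric (hypothetically named `accuracy`) decides the outcome, and that serving is a directory the model file gets copied into. The function names and all paths are illustrative.

```python
import json
import shutil
from pathlib import Path


def is_improvement(new, best, metric="accuracy"):
    """True when `new` beats `best` on `metric` (assumes higher is better)."""
    if best is None:
        # No saved metrics yet, so any model counts as an improvement.
        return True
    return new[metric] > best[metric]


def promote_if_better(model_path, new_metrics, metrics_path, serving_dir,
                      metric="accuracy"):
    """Copy the model into `serving_dir` if it beats the saved metrics.

    Returns True when the model was promoted, False otherwise.
    Updating the saved metrics afterwards is left to the pipeline.
    """
    metrics_path = Path(metrics_path)
    best = json.loads(metrics_path.read_text()) if metrics_path.exists() else None
    if not is_improvement(new_metrics, best, metric):
        return False
    serving_dir = Path(serving_dir)
    serving_dir.mkdir(parents=True, exist_ok=True)
    # The new file appearing in serving_dir is what a deployment job would watch for.
    shutil.copy2(model_path, serving_dir / Path(model_path).name)
    return True
```

For example, `promote_if_better("model.pkl", {"accuracy": 0.93}, "metrics.json", "serving")` would copy `model.pkl` into `serving/` only if the saved accuracy is below 0.93.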