This project was built for a talk at #devfestindia-2020. It covers:
- Data versioning
- Building training pipelines
- Versioning models
- Deploying with Docker and Kubernetes
You will need:
- Python 3.7
- pip
- Clone the repo

```
git clone <repo_url>
cd <repo-name>
```
- Pull the data from the gdrive remote

```
dvc pull -r gdrive
```

It will ask for authorization; open the URL it prints and enter the auth key back in the terminal.
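As an alternative to a manual pull, DVC's Python API can read tracked files directly from the remote. A minimal sketch, assuming a recent DVC version and the data path used by the pipeline below:

```python
# Sketch: read a DVC-tracked file straight from the remote,
# without pulling the whole workspace first.
# Assumes assets/original_data/train.csv is tracked in this repo
# and "gdrive" is the remote configured above.
import dvc.api

data = dvc.api.read(
    "assets/original_data/train.csv",
    remote="gdrive",
)
print(data[:200])  # first few characters of the CSV
```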
- Install requirements

```
pip install -r requirements.txt
```
- Run the application

```
python app/app.py
```
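The real entry point is `app/app.py` in the repo. Purely as an illustration of the pattern it follows, loading the DVC-tracked model artifact and serving predictions, a hypothetical minimal version might look like this (the model filename and sklearn-style interface are assumptions):

```python
# Hypothetical sketch only -- see app/app.py for the real code.
# Assumes the train stage wrote a pickled, sklearn-style model
# into assets/models/ (the exact filename is an assumption).
import pickle

with open("assets/models/model.pkl", "rb") as f:
    model = pickle.load(f)

def predict(features):
    """Run the versioned model on a single feature vector."""
    return model.predict([features])[0]

if __name__ == "__main__":
    print(predict([0.1, 0.2, 0.3]))  # dummy feature vector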
- Update the model (or any other ML step), then reproduce the pipeline and push the new artifacts

```
dvc repro
dvc push -r gdrive
```
How is the pipeline created?
```
dvc run -n preprocess -d src/preprocess.py -d assets/original_data/train.csv -o assets/preprocessed/ python src/preprocess.py
dvc run -n featurize -d src/featurize.py -d assets/preprocessed/train.csv -o assets/featurized/ python src/featurize.py
dvc run -n train_test_eval -d src/model.py -d assets/featurized -p model.random,model.split -o assets/models -M assets/eval/scores.json python src/model.py
```
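Each stage above is an ordinary Python script; the `-p` flags tell DVC which keys of `params.yaml` the stage depends on, and `-M` registers a metrics file. As a rough sketch of how `src/model.py` wires into those flags (the training logic is elided and the metric value is a placeholder):

```python
# Sketch of the shape of src/model.py (details are assumptions).
# -p model.random,model.split makes DVC watch these keys in params.yaml;
# -M assets/eval/scores.json registers the metrics file written below.
import json
import os

import yaml

with open("params.yaml") as f:
    params = yaml.safe_load(f)["model"]

random_state = params["random"]  # matched by -p model.random
split = params["split"]          # matched by -p model.split

# ... train on assets/featurized/, save the model under assets/models/ ...

os.makedirs("assets/eval", exist_ok=True)
with open("assets/eval/scores.json", "w") as f:
    json.dump({"accuracy": 0.0}, f)  # placeholder metric value
```

Running these `dvc run` commands records the pipeline in `dvc.yaml` (with a `dvc.lock` snapshot of inputs and outputs), which is what `dvc repro` replays whenever a dependency or parameter changes.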