This repository demonstrates CI/CD automation for machine learning. The project uses DVC to build and test a model in a reproducible way.
- Fork and clone this repository.
- Install DVC on your machine and check that Python is installed.
- Create a virtualenv:
$ python -m venv venv
$ source venv/bin/activate
- Install dependencies:
$ pip install -r requirements.txt
To download the training dataset and train the model, run:
$ dvc repro
This creates the model file in the models directory.
You can try the model's inference with:
$ python scripts/predict.py
The metrics directory should now contain benchmarks of the model as PNG plots and CSV files.
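To inspect the CSV benchmarks without opening them by hand, a short script along these lines may help. It is only a sketch: it assumes that metrics is a directory at the repository root and that the CSV files have a header row. The function names load_metrics and list_metric_files are hypothetical and not part of this repository.

```python
import csv
from pathlib import Path


def load_metrics(csv_path):
    """Read one metrics CSV into a list of row dictionaries (values stay strings)."""
    with Path(csv_path).open(newline="") as handle:
        return list(csv.DictReader(handle))


def list_metric_files(metrics_dir="metrics"):
    """Return the CSV files found in the metrics directory, sorted by name.

    The default directory name is an assumption; adjust it to match the repo.
    """
    return sorted(Path(metrics_dir).glob("*.csv"))


if __name__ == "__main__":
    # Print a one-line summary per CSV file found.
    for path in list_metric_files():
        rows = load_metrics(path)
        print(f"{path.name}: {len(rows)} rows")
```

Run it from the repository root after dvc repro has finished. If no CSV files are found, it prints nothing.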
The DVC pipeline contains the following stages:
- download: download training dataset
- validate: validate dataset
- prepare: prepare dataset
- train: train the model
- evaluate: evaluate model
- metrics: calculate model error
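DVC runs stages in dependency order, so a stage runs only after the stages it depends on. As a rough illustration of that idea, the sketch below sorts the stage names with Python's standard graphlib module (Python 3.9+). The dependency map is an assumption that treats the list above as a simple linear chain. The real dependencies are whatever the repository's pipeline definition declares, and DVC itself does not use this code.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each stage maps to the stages it depends on.
# This assumes a linear chain; the actual pipeline may differ.
STAGE_DEPS = {
    "download": set(),
    "validate": {"download"},
    "prepare": {"validate"},
    "train": {"prepare"},
    "evaluate": {"train"},
    "metrics": {"evaluate"},
}


def execution_order(deps):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(" -> ".join(execution_order(STAGE_DEPS)))
```

Under the linear-chain assumption the result matches the order in which the stages are listed above.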
The project includes a Flask API server. You can start it with:
$ cd api
$ python server.py
You can build a Docker image with the model and API server with:
$ docker build -t california-housing .
Try running the built image with:
$ docker run -it -p 8080:8080 california-housing
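Once the container is running, you can quickly confirm that something is listening on the published port. The sketch below only checks that a TCP connection to port 8080 succeeds. It does not call any API route, because the routes are not described here. The helper name is_port_open is hypothetical.

```python
import socket


def is_port_open(host="localhost", port=8080, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    status = "reachable" if is_port_open() else "not reachable"
    print(f"API server on port 8080 is {status}")
```

A successful connection shows only that the port is open, not that the model is serving predictions correctly.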
To create the app for the first time:
- Create a free Fly.io account.
- Install flyctl.
- Run:
$ fly launch
Once you have launched the application, you can set up Semaphore for continuous delivery:
- Open your Fly.io account and create an access token.
- Create the secret fly-deploy with the environment variable FLY_TOKEN set to that access token.
- Edit the CI/CD pipeline: go to the last pipeline and add your secret to the block.
- Save the pipeline.