This project demonstrates how to use FastAPI to create a REST API for a machine learning model. DVC tracks the data and the pipeline dependencies, which keeps the ML pipeline reproducible.
Python, DVC, scikit-learn, pandas, FastAPI and Conda
The DVC pipeline can also be viewed in the terminal with the command:
dvc dag
Output
+-----------------------------+
| starter/data/census.csv.dvc |
+-----------------------------+
*
*
*
+------------+
| clean_data |
+------------+
*
*
*
+-------------+
| train_model |
+-------------+
*
*
*
+----------+
| evaluate |
+----------+
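The graph above reads top to bottom: each stage runs only after the one above it has finished. As a rough illustration (not part of the project code), the same ordering can be sketched with Python's standard-library graphlib. The PIPELINE dictionary and the execution_order helper are hypothetical names introduced here; the stage names are copied from the dvc dag output.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes it depends on, mirroring the
# arrows in the `dvc dag` output (names copied from that output).
PIPELINE = {
    "clean_data": {"starter/data/census.csv.dvc"},
    "train_model": {"clean_data"},
    "evaluate": {"train_model"},
}


def execution_order(pipeline):
    """Return the nodes in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


if __name__ == "__main__":
    print(" -> ".join(execution_order(PIPELINE)))
```

Because the pipeline is a single chain, only one valid order exists: the tracked data file first, then clean_data, train_model and evaluate.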
The Makefile contains the commands to set up the environment for the project. This will create a conda environment and install the dependencies.
If you prefer to install the dependencies with pip, you can use the requirements.txt file.
The commands below are more or less the same as the ones used to create the project. They are not required to clone and run the project.
Initialize DVC inside the Git repository.
dvc init
Start to track the UCI census data file.
dvc add starter/data/census.csv
Configure an AWS S3 bucket as the default remote storage for the tracked file.
dvc remote add -d storage s3://<name-of-s3-bucket>
Tell dvc to use the AWS profile named udacity, instead of the default profile.
dvc remote modify storage profile udacity
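Register the clean_data.py script as a pipeline stage. Because of --no-exec, the stage is only written to the pipeline definition and is not executed yet.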
dvc run -n clean_data -d starter/data/census.csv -d starter/starter/clean_data.py -o starter/data/census_clean.csv --no-exec python starter/starter/clean_data.py
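The command above is long, so it may help to see how its parts fit together: -n names the stage, each -d declares a dependency, each -o declares an output, and everything after --no-exec is the command the stage will run. The sketch below only assembles the argument list and does not call DVC. The build_stage_command helper and the CLEAN_DATA_ARGS constant are hypothetical names used for illustration.

```python
import shlex


def build_stage_command(name, deps, outs, command):
    """Assemble the `dvc run` arguments for one pipeline stage."""
    args = ["dvc", "run", "-n", name]
    for dep in deps:
        args += ["-d", dep]  # inputs that trigger a re-run when they change
    for out in outs:
        args += ["-o", out]  # files produced by the stage and tracked by DVC
    args.append("--no-exec")  # register the stage without running it
    args += shlex.split(command)
    return args


# Same values as in the command shown above.
CLEAN_DATA_ARGS = build_stage_command(
    name="clean_data",
    deps=["starter/data/census.csv", "starter/starter/clean_data.py"],
    outs=["starter/data/census_clean.csv"],
    command="python starter/starter/clean_data.py",
)

if __name__ == "__main__":
    print(shlex.join(CLEAN_DATA_ARGS))
```

The same helper could describe the train_model and evaluate stages by changing the arguments. Newer DVC releases replace dvc run with dvc stage add, so check the installed version before reusing the command as written.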
The DVC pipeline can be run with the dvc repro command.
Create Heroku application
heroku create marcus-census-fastapi --buildpack heroku/python
Set the git remote heroku to https://git.heroku.com/marcus-census-fastapi.git
heroku git:remote --app marcus-census-fastapi
Add an extra buildpack layer for DVC (see also the Aptfile).
heroku buildpacks:add --index 1 heroku-community/apt
Run git push heroku main to create a new release using these buildpacks.
git push heroku main
Add AWS configuration keys
heroku config:set AWS_ACCESS_KEY_ID=xxx AWS_SECRET_ACCESS_KEY=yyy
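With the keys in place, the application can fetch the DVC-tracked files from S3 when it starts on Heroku. The following is a minimal sketch of that startup step, not the project's actual code. It assumes that the dvc executable is available on the dyno, that Heroku sets the DYNO environment variable there, and that the deployed slug contains the .dvc directory but no .git directory, which is why core.no_scm is enabled. The names missing_aws_vars and pull_dvc_data are hypothetical.

```python
import os
import subprocess

# The two variables set with `heroku config:set` above.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_vars(env):
    """Return the names of required AWS variables that are unset or empty."""
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


def pull_dvc_data(env=None):
    """Pull DVC-tracked files when running on a Heroku dyno.

    Returns True if a pull was performed, False if it was skipped.
    """
    env = os.environ if env is None else env
    # Assumption: DYNO is only present on Heroku dynos.
    if "DYNO" not in env or not os.path.isdir(".dvc"):
        return False
    missing = missing_aws_vars(env)
    if missing:
        raise RuntimeError("Missing AWS settings: " + ", ".join(missing))
    # Assumption: the slug has no .git directory, so DVC must run without SCM.
    subprocess.run(["dvc", "config", "core.no_scm", "true"], check=True)
    subprocess.run(["dvc", "pull"], check=True)
    return True
```

Calling pull_dvc_data() once at import time of the API module, before the model is loaded, would be one place to hook this in. Outside Heroku the function returns False and does nothing.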
See also the DVC on Heroku article by Andrew Kane.