Application code and supporting material for the Codemotion 2020 workshop "From 0Ops to MLOps" (slides), held on 22/10/2020.
The project focuses on incrementally improving the supporting MLOps infrastructure of a simple ML-powered application.
Starting from the most basic config (0Ops), we gradually "decorate" the app over a series of Milestones with standard CI/CD workflows to arrive at a setup that enables basic ML auditability, reproducibility, and collaboration.
In this project we take advantage of basic functionality of state-of-the-art tools such as:
- Streamlit, for easily creating a frontend for our app
- Heroku, for easily deploying our Streamlit app to the world
- MLflow (Tracking), for enabling ML artifact logging and experiment tracking
- AWS EC2, for hosting the MLflow server that logs the details of our experiments
- AWS S3, for storing the trained ML artifacts used by our application
- Github Actions, for creating CI/CD workflows that tie everything together and achieve "MLOps"
- Fork this repo
- Clone your forked repo: `git clone git@github.com:<your-gh-name>/mlops_tutorial`
- Create a virtual environment: `conda create --name mlops_tutorial python=3.8.5`
- Activate the virtual environment: `conda activate mlops_tutorial`
- Install all dependencies: `pip install -r requirements.txt`
- Make sure all files under `.github/workflows` have the file extension `.disabled`. If any `*.yml` is present, rename it to `*.disabled`
- Make sure `ARTIFACT_LOCATION='local'` in `Dockerfile` and `train.Dockerfile`
- My app deployed on Heroku: https://polar-oasis-38285.herokuapp.com/
- The Streamlit app is `app.py`. The app has been dockerised with `Dockerfile`
- The ML model training script producing artefacts is `train.py`. It has also been dockerised with `train.Dockerfile`
- The `ARTIFACT_LOCATION` parameter in both the app and training Dockerfiles controls the various stages of the workshop:
  - 0Ops stage: `ARTIFACT_LOCATION='local'`
  - AlmostOps stage: `ARTIFACT_LOCATION='s3'`
  - MLOps stage: `ARTIFACT_LOCATION='s3_mlflow'`
- S3 bucket: https://s3.console.aws.amazon.com/s3/buckets/workshop-mlflow-artifacts/?region=eu-west-2&tab=overview
- MLflow server: http://ec2-18-134-150-82.eu-west-2.compute.amazonaws.com/
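The staged `ARTIFACT_LOCATION` switch above boils down to a small dispatch on where artefacts are read from. A minimal sketch of that idea — the function name, bucket layout, and file name here are illustrative assumptions, not the repo's actual `config.py` API:

```python
import os


def artifact_source(location: str, user: str = "demo-user") -> str:
    """Map an ARTIFACT_LOCATION value to where artefacts are read from.

    Hypothetical helper: the real repo encodes this in config.py and
    the Dockerfiles, not in a standalone function.
    """
    if location == "local":
        # 0Ops: artefacts live next to the app
        return os.path.join("artifacts", "model.joblib")
    elif location in ("s3", "s3_mlflow"):
        # AlmostOps/MLOps: one subdirectory per participant in the shared bucket
        return f"s3://workshop-mlflow-artifacts/{user}/model.joblib"
    raise ValueError(f"Unknown ARTIFACT_LOCATION: {location}")
```

The `s3_mlflow` stage reads from the same S3 location; the difference is the extra MLflow tracking instrumentation introduced in the final milestone.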
In this section we will
- Deploy the ML-powered application to the world without "Ops" of any kind.
For the purposes of this tutorial we will use a very simple application that uses an ML model to predict the sentiment of a user-provided movie review.
The application itself is a slightly modified version of the `galleries/sentiment_analyzer` example Streamlit app found in the awesome-streamlit repo.
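To make the "ML-powered" part concrete, here is a toy sketch of the kind of sentiment model such an app wraps. The training data, pipeline choice, and `alpha` value are illustrative only — the repo's actual model lives in `train.py`:

```python
# Toy sentiment classifier: bag-of-words + Naive Bayes.
# The Streamlit app would call predict_sentiment() on user-entered reviews.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

reviews = [
    "I loved this movie", "wonderful acting and a great plot",
    "an excellent and enjoyable film",
    "I hated this movie", "terrible acting and a boring plot",
    "an awful and painful film",
]
labels = ["positive"] * 3 + ["negative"] * 3

model = make_pipeline(CountVectorizer(), MultinomialNB(alpha=0.9))
model.fit(reviews, labels)


def predict_sentiment(review: str) -> str:
    """Return 'positive' or 'negative' for a user-provided review."""
    return model.predict([review])[0]
```

In the real app, the fitted pipeline is serialised by `train.py` and loaded by `app.py` according to `ARTIFACT_LOCATION`.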
Make sure you have a Heroku account and the Heroku CLI installed.
First, we will tell the app that we are running in 0Ops mode:

- Set `ENV ARTIFACT_LOCATION='local'` in `Dockerfile` and `train.Dockerfile`
Secondly, we will use the Heroku CLI to build and deploy the application Docker container to Heroku, and to the world!
- Steps:
  - Log in to the CLI: `heroku login -i`
  - Log in to the Container Registry: `heroku container:login`
  - Create the app: `heroku create`. Take note of the name of the created app!
  - Add secrets to your Github repo (repo/settings/secrets). We will need these later:
    - `HEROKU_EMAIL`: the email you use on Heroku
    - `HEROKU_APP_NAME`: the output of the `heroku create` step, e.g. polar-oasis-12478
    - `HEROKU_API_KEY`: get it from your Heroku account settings
  - Build the container and push it to the Heroku Container Registry (this will take a while!): `heroku container:push web`
  - Release the uploaded container to the app: `heroku container:release web`
  - See the public app in your browser: `heroku open`
Optional
You may want to run the training script and the application on your machine.
Training:

- Build container: `docker build -f train.Dockerfile -t mlops_tutorial_train .`
- Run training: `docker run -it mlops_tutorial_train`
- Or just run with Python: `python train.py local`

App:

- Build container: `docker build -f Dockerfile -t mlops_tutorial .`
- Run container: `docker run -e PORT=8501 -it mlops_tutorial`
- Or just run with Python: `streamlit run app.py local`
If you want to run on your machine in the subsequent stages, you need to modify the above commands to include some environment variables as build args. See the comments at the bottom of the Dockerfiles for more info.
In this section, we will
- Enable loading of artifacts from S3
- Enable Continuous Deployment of the application, using a Github Action triggered upon any push to master.
For the purposes of the workshop we will use my S3 bucket, in order to mitigate issues with setting up.
Some setup first!
You need permissions to read and write to the workshop S3 bucket:

- Create a file named `.env`
- Add the following lines:
  export AWS_ACCESS_KEY_ID=<the key I send you on Discord>
  export AWS_SECRET_ACCESS_KEY=<the key I send you on Discord>
- Run `source .env` in your terminal
- Also, add the above AWS credentials as secrets in Github (repo/settings/secrets); we will need them later.
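Before moving on, it can help to confirm the exported variables are actually visible in your current session (boto3 picks them up from the environment automatically). A small check, assuming nothing beyond the standard library — the helper name is our own, not part of the repo:

```python
import os


def aws_credentials_present() -> bool:
    """True if both AWS credential variables are set and non-empty.

    boto3 reads these variables itself; this only verifies that
    `source .env` exported them into the current environment.
    """
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    return all(os.environ.get(name) for name in required)
```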
In this stage, we will instruct the app to load artefacts from S3, rather than its local environment.
The two changes are:
- Make the training script write artefacts to a dedicated subdirectory for each participant in the workshop's S3 bucket:
  - In `config.py` set the `Config` class attribute `USER` to something unique, e.g. your Github handle
  - In `train.Dockerfile` set `ARTIFACT_LOCATION='s3'`
  - Build the training container: `docker build -f train.Dockerfile -t mlops_tutorial_train .`
  - Run the training container: `docker run -e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY -it mlops_tutorial_train`
  - If you have issues setting environment variables, hardcode the credentials as arguments to the run command
- Instruct the app to load artefacts from S3, rather than the local environment:
  - In `Dockerfile` set `ARTIFACT_LOCATION='s3'`
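The two changes above amount to: training uploads the serialised model to a per-participant S3 prefix, and the app downloads it from the same prefix at startup. A sketch of that flow using the boto3 S3 client API — the function names, key layout, and file name are assumptions, not the repo's actual code:

```python
# Hypothetical AlmostOps artefact flow: write on train, read on app startup.
import pickle

BUCKET = "workshop-mlflow-artifacts"


def s3_artifact_key(user: str, filename: str = "model.pkl") -> str:
    # each participant writes under their own subdirectory
    return f"{user}/{filename}"


def upload_model(model, user: str, s3_client) -> str:
    """train.py side: serialise the model and put it in the bucket."""
    key = s3_artifact_key(user)
    s3_client.put_object(Bucket=BUCKET, Key=key, Body=pickle.dumps(model))
    return key


def download_model(user: str, s3_client):
    """app.py side: fetch and deserialise the model."""
    key = s3_artifact_key(user)
    body = s3_client.get_object(Bucket=BUCKET, Key=key)["Body"].read()
    return pickle.loads(body)
```

With real AWS credentials, `s3_client` would be `boto3.client("s3")`; the credentials sourced from `.env` are picked up automatically.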
This enables continuous deployment for the app with Github Actions, triggered on master branch push.
- Rename `deploy_app.disabled` to `deploy_app.yml` in `.github/workflows/`
- Change the email field to your own
- Commit all your changes to git and push to master
git add .
git commit -m 'Milestone 3'
git push
The Github Action knows to deploy the right Heroku app because of the secrets we added to Github earlier: `HEROKU_APP_NAME` and `HEROKU_API_KEY`.
Now, the app will be automatically redeployed whenever you modify the master branch!
In the final section, we
- Introduce experiment tracking with MLflow Tracking server, deployed on an EC2 instance
- Enable model training to take place automatically within a CI/CD Github Actions flow, rather than manually
- Finally, we use Github Actions to create a cool pull request workflow for updating the models!
Here, we will leverage simple "decorations" in the application and training jobs to achieve MLflow instrumentation.
- Configure the application to communicate with the MLflow server:
  - Add a new variable to the `.env` file: `MLFLOW_TRACKING_URI=http://testuser:test@ec2-18-134-150-82.eu-west-2.compute.amazonaws.com/`
  - Run `source .env` in your terminal
  - Also, add `MLFLOW_TRACKING_URI` as a secret in Github (repo/settings/secrets)
  - Run `mlflow_setup.py`, note your experiment_id, and overwrite the existing value in `config.py`
  - In `Dockerfile` set `ARTIFACT_LOCATION='s3_mlflow'`
  - In `train.Dockerfile` set `ARTIFACT_LOCATION='s3_mlflow'`
- Run a training job to register your model with MLflow:
  - `python train.py s3_mlflow --production-ready`
  - or use Docker: in `train.Dockerfile` set `ENV PRODUCTION_READY='--production-ready'`, then build and run the `train.Dockerfile` container
  - Go to the MLflow server and be excited!
- Commit and push to master, wait for the automated deployment, and check out the app!
  git add .
  git commit -m "Milestone 4"
  git push
This Github Action has been set to be triggered from a PR comment, but we could also have chosen it to be triggered by push to master.
We will see it in action in the next milestone.
For now, all we have to do is to:
- Rename `evaluate.disabled` to `evaluate.yml` and `deploy_app.disabled` to `deploy_app.yml`
- Commit and push to master:
  git add .
  git commit -m "Milestone 5"
  git push
Now we get to see the workflow in action!!
- Create a new branch: `git checkout -b test-ml-pr`
- Update any model config param in `train.py`, e.g. `alpha=0.9`
- Open a pull request against this branch
- Enter `/evaluate` in the PR chat and see magic starting to happen
- Check out the model results
- Enter `/deploy-candidate` in the PR chat and wait for more magic to happen
- Now, merge the PR to redeploy the Heroku app
- Wait for the action to complete and check out the app!
- Launch EC2 instance
  - Create an IAM role for EC2 with S3 access
  - Choose Amazon Linux AMI 2018.03.0 (HVM), SSD Volume Type - ami-0765d48d7e15beb93
  - In configure instance, specify the IAM role you created
  - In the security group, create:
    - HTTP: source 0.0.0.0/0, ::/0
    - SSH: source 0.0.0.0/0
  - Create a key pair and save the pem file to the repo directory (it is gitignored)
  - Run `chmod 400 <your_key>.pem`
- Install MLflow on EC2
  - From link
  - From the repo directory, ssh into the EC2 instance: `ssh -i "<your_key>.pem" ec2-user@ec2<your-instance>`
  - Install MLflow: `sudo pip install mlflow`
  - Downgrade dateutil (LOL): `sudo pip install -U python-dateutil==2.6.1`
  - Install boto3: `sudo pip install boto3`
- Configure nginx
  - Install nginx: `sudo yum install nginx`
  - Start nginx: `sudo service nginx start`
  - Install httpd tools to allow password protection: `sudo yum install httpd-tools`
  - Create a password for user testuser: `sudo htpasswd -c /etc/nginx/.htpasswd testuser`
  - Enable global read/write permissions on the nginx directory: `sudo chmod 777 /etc/nginx`
  - Delete nginx.conf so we can replace it with a modified one: `rm /etc/nginx/nginx.conf`
  - Open a new terminal window and upload the nginx.conf file in this repo to EC2: `scp -i <your_key>.pem nginx.conf ec2-user@ec2<your-instance>:/etc/nginx/`
  - Reload nginx: `sudo service nginx reload`
- Run MLflow server
  - Start the server: `mlflow server --default-artifact-root s3://<your-s3-bucket> --host 0.0.0.0`
  - Check it out! Open a browser and go to your instance.