aml_pipeline_sdk


How to train a model with an Azure ML pipeline, using the Python SDK

This project shows how to train a Fashion MNIST model using an Azure ML pipeline, and how to deploy it using an online managed endpoint. It demonstrates how to create Azure ML resources using the Python SDK, and it uses MLflow for tracking and model representation.

Blog post

To learn more about the code in this repo, check out the accompanying blog post: https://bea.stollnitz.com/blog/aml-pipeline/

Setup

  • You need to have an Azure subscription. You can get a free subscription to try it out.
  • Create a resource group.
  • Create a new machine learning workspace by following the "Create the workspace" section of the documentation. Keep in mind that you'll be creating a "machine learning workspace" Azure resource, not a "workspace" Azure resource, which is entirely different!
  • Install the Azure CLI by following the instructions in the documentation.
  • Install and activate the conda environment by executing the following commands:
conda env create -f environment.yml
conda activate aml_pipeline_sdk
  • Within VS Code, open the Command Palette by pressing "Ctrl + Shift + P," type "Python: Select Interpreter," and select the environment that matches the name of this project.
  • In a terminal window, log in to Azure by executing az login --use-device-code.
  • Add a config.json file to the root of your project (or somewhere in the parent folder hierarchy) containing your Azure subscription ID, resource group, and workspace:
{
    "subscription_id": "<YOUR_SUBSCRIPTION_ID>",
    "resource_group": "<YOUR_RESOURCE_GROUP>",
    "workspace_name": "<YOUR_WORKSPACE>"
}
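When you construct a workspace client, the SDK searches the current directory and its parents for this config.json. A minimal sketch of writing and validating the file before handing it to the SDK (the commented-out Azure calls assume the azure-ai-ml and azure-identity packages and a prior az login):

```python
import json
from pathlib import Path

# Write a config.json with the same placeholder values as above.
config = {
    "subscription_id": "<YOUR_SUBSCRIPTION_ID>",
    "resource_group": "<YOUR_RESOURCE_GROUP>",
    "workspace_name": "<YOUR_WORKSPACE>",
}
Path("config.json").write_text(json.dumps(config, indent=4))

# Sanity-check the file: all three keys must be present, or
# connecting to the workspace will fail.
loaded = json.loads(Path("config.json").read_text())
assert set(loaded) == {"subscription_id", "resource_group", "workspace_name"}

# With the file in place, the workspace connection looks like this:
# from azure.ai.ml import MLClient
# from azure.identity import DefaultAzureCredential
# ml_client = MLClient.from_config(credential=DefaultAzureCredential())
```

Keeping config.json out of source control is a good idea, since the subscription ID is account-specific.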

Training and inference on your development machine

  • Under "Run and Debug" on VS Code's left navigation, choose the "Train locally" run configuration and press F5.

  • Do the same for the "Test locally" run configuration.

  • You can analyze the metrics logged in the "mlruns" directory with the following command:

mlflow ui
  • Make a local prediction using the trained MLflow model. You can use either CSV or JSON input files:
cd aml_pipeline_sdk
mlflow models predict --model-uri "model" --input-path "test_data/images.csv" --content-type csv --env-manager local
mlflow models predict --model-uri "model" --input-path "test_data/images.json" --content-type json --env-manager local
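The real test_data files ship with the repo, but for orientation, here is a sketch of how such input files can be built by hand. The 784-column flattened 28x28 image layout, the column names, and the pandas "split" JSON orientation are all assumptions for illustration, not the repo's actual schema:

```python
import csv
import json
from pathlib import Path

# Two fake 28x28 grayscale images, flattened to 784 pixel values each.
# (Assumed shape for Fashion MNIST input; check the repo's test_data
# files for the actual format.)
images = [[0.0] * 784, [1.0] * 784]
columns = [f"col_{i}" for i in range(784)]

# CSV variant: a header row of column names, then one row per image.
with open("images_sample.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(columns)
    writer.writerows(images)

# JSON variant: pandas "split" orientation, one of the layouts
# MLflow's scoring tools can parse.
payload = {"columns": columns, "data": images}
Path("images_sample.json").write_text(json.dumps(payload))
```

The --env-manager local flag in the commands above tells MLflow to run the model in the currently active conda environment instead of building a fresh one.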

Train and deploy in the cloud

Create and run the pipeline

Select the run configuration "Train in the cloud" and press F5 to train in the cloud.

Create and invoke the endpoint

Select the run configuration "Create endpoint" and press F5 to create an endpoint in the cloud and invoke it.

Clean up the endpoint

Once you're done working with the endpoint, you can clean it up to avoid getting charged by selecting the "Delete endpoint" run configuration and pressing F5.

Related resources