This project shows how to train a Fashion MNIST model with an Azure ML job, and how to deploy it using a managed online endpoint. It uses MLflow for tracking and model representation.
- You need an Azure subscription. If you don't have one, you can create a free account to try it out.
- Create a resource group.
- Create a new machine learning workspace by following the "Create the workspace" section of the documentation. Keep in mind that you'll be creating an Azure "machine learning workspace" resource, not a "workspace" resource, which is entirely different!
- If you have access to GitHub Codespaces, click on the "Code" button in this GitHub repo, select the "Codespaces" tab, and then click on "New codespace."
- Alternatively, if you plan to use your local machine:
- Install the Azure CLI by following the instructions in the documentation.
- Install the ML extension to the Azure CLI by following the "Installation" section of the documentation.
- In a terminal window, log in to Azure by executing `az login --use-device-code`.
- Set your default subscription by executing `az account set -s "<YOUR_SUBSCRIPTION_NAME_OR_ID>"`. You can verify your default subscription by executing `az account show`, or by looking at `~/.azure/azureProfile.json`.
- Set your default resource group and workspace by executing `az configure --defaults group="<YOUR_RESOURCE_GROUP>" workspace="<YOUR_WORKSPACE>"`. You can verify your defaults by executing `az configure --list-defaults` or by looking at `~/.azure/config`.
- You can now open the Azure Machine Learning studio, where you'll be able to see and manage all the machine learning resources we'll be creating.
- Although not essential to run the code in this post, I highly recommend installing the Azure Machine Learning extension for VS Code.
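For reference, the defaults that `az configure` stores end up in `~/.azure/config`; a typical file might contain a section like this (the values below are placeholders, not your actual settings):

```ini
[defaults]
group = my-resource-group
workspace = my-ml-workspace
```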
If you have access to GitHub Codespaces, click on the "Code" button in this GitHub repo, select the "Codespaces" tab, and then click on "New codespace."
Alternatively, you can set up your local machine using the following steps.
Install the conda environment:

```bash
conda env create -f environment.yml
```

Activate the conda environment:

```bash
conda activate aml_command_artifact
```
- Run train.py (for example, by pressing F5 in VS Code). This saves the trained model as an MLflow artifact.
- Analyze the metrics logged in the "mlruns" directory with the following command:

  ```bash
  mlflow ui
  ```
- Set the `model_uri` variable to the model you want to use in prediction. You can use the model created in the latest training run, whose URI train.py prints, or choose one from the MLflow UI. For example:

  ```bash
  model_uri=runs:/4dff763fdab946eba83f469618544604/model_artifact
  ```
- Make a local prediction using the trained MLflow model. You can use either CSV or JSON files:

  ```bash
  mlflow models predict --model-uri $model_uri --input-path "aml_command_artifact/test_data/images.csv" --content-type csv
  mlflow models predict --model-uri $model_uri --input-path "aml_command_artifact/test_data/images.json" --content-type json
  ```
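For context, a JSON input in pandas "split" orientation can be built with nothing but the standard library. The sketch below is illustrative only: the column names and pixel values are made up (a real Fashion MNIST row carries 28 × 28 = 784 pixel values), and the exact envelope MLflow expects varies by MLflow version.

```python
import json

# Sketch of a pandas "split"-oriented payload of the kind MLflow scoring
# accepts. Column names and pixel values here are invented placeholders.
columns = [f"pixel_{i}" for i in range(4)]
rows = [
    [0, 0, 128, 255],
    [255, 128, 0, 0],
]
payload = {"columns": columns, "data": rows}
print(json.dumps(payload))
```

Inspecting the repo's `test_data/images.json` will show the exact shape this project uses.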
Change to the project directory:

```bash
cd aml_command_artifact
```
Create the compute cluster:

```bash
az ml compute create -f cloud/cluster-cpu.yml
```
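For context, a minimal compute cluster spec generally looks like the sketch below; the VM size and instance counts are assumptions, not necessarily what `cloud/cluster-cpu.yml` actually contains:

```yaml
# Sketch of a cluster spec; size and instance counts are assumptions.
$schema: https://azuremlschemas.azureedge.net/latest/amlCompute.schema.json
name: cluster-cpu
type: amlcompute
size: Standard_DS3_v2
min_instances: 0
max_instances: 4
```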
Create the dataset:

```bash
az ml data create -f cloud/data.yml
```
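A data asset spec typically names the asset and points at a path to upload. In the sketch below, the name, description, and path are assumptions rather than the repo's actual values:

```yaml
# Sketch of a data asset spec; name and path are assumptions.
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: data-fashion-mnist
description: Fashion MNIST data.
path: ../data
type: uri_folder
```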
Run the training job, capturing its run ID:

```bash
run_id=$(az ml job create -f cloud/job.yml --query name -o tsv)
```
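A command job spec wires together code, a command, an environment, and compute. The sketch below shows the general shape only; the code path, environment name, and other values are assumptions, not the repo's actual `cloud/job.yml`:

```yaml
# Sketch of a command job spec; paths and environment are assumptions.
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: ../src
command: python train.py
environment: azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest
compute: azureml:cluster-cpu
experiment_name: aml_command_artifact
```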
Go to the Azure ML studio and wait until the job completes. You don't need to download the trained model, but here's how you would do it if you wanted to:

```bash
az ml job download --name $run_id
```
Create the Azure ML model from the trained model saved as an artifact:

```bash
az ml model create --name model-command-artifact --version 1 --path runs:/$run_id/model_artifact --type mlflow_model
```
Create the endpoint:

```bash
az ml online-endpoint create -f cloud/endpoint.yml
```

Create the deployment, routing all traffic to it:

```bash
az ml online-deployment create -f cloud/deployment.yml --all-traffic
```
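For reference, minimal endpoint and deployment specs generally look like the sketches below; the deployment name, instance type, and instance count are assumptions rather than the repo's actual values:

```yaml
# cloud/endpoint.yml (sketch)
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: endpoint-command-artifact
auth_mode: key
```

```yaml
# cloud/deployment.yml (sketch); name, instance type, and count are assumptions.
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: blue
endpoint_name: endpoint-command-artifact
model: azureml:model-command-artifact:1
instance_type: Standard_DS3_v2
instance_count: 1
```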
Invoke the endpoint:

```bash
az ml online-endpoint invoke --name endpoint-command-artifact --request-file test_data/images_azureml.json
```
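If you'd rather call the endpoint over raw REST than through the CLI, a stdlib-only sketch is below. The scoring URI, key, and payload are placeholders: you'd fetch the real URI and key with `az ml online-endpoint show` and `az ml online-endpoint get-credentials`, and send the contents of `images_azureml.json` as the body.

```python
import json
import urllib.request

def build_scoring_request(scoring_uri: str, key: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated POST request for a managed online endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        },
    )

# Placeholder values: fetch the real ones with
#   az ml online-endpoint show --name endpoint-command-artifact --query scoring_uri
#   az ml online-endpoint get-credentials --name endpoint-command-artifact
req = build_scoring_request(
    "https://<YOUR_ENDPOINT>.inference.ml.azure.com/score",
    "<YOUR_ENDPOINT_KEY>",
    {"input_data": []},  # replace with the contents of images_azureml.json
)
# Sending is commented out so the sketch runs without a live endpoint:
# print(urllib.request.urlopen(req).read().decode())
print(req.get_method())
```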
When you're done, delete the endpoint to avoid getting charged:

```bash
az ml online-endpoint delete --name endpoint-command-artifact -y
```