This project demonstrates the end-to-end workflow of operationalizing a machine learning pipeline using Azure ML Studio.
The key steps of this project are defined below:
In order to work with the Azure ML CLI, it is necessary to log in to the Azure account and create a service principal that has the Owner role in the Azure ML workspace.
```bash
# Create the service principal
az ad sp create-for-rbac --sdk-auth --name ml-auth

# Inspect the service principal to obtain its objectId
az ad sp show --id ac3638b4-3928-4a75-b4e5-131ec8887d04

# Grant the service principal the Owner role on the workspace
az ml workspace share \
    -w <your-workspace-name> \
    -g <your-resource-group-name> \
    --user <objectId> \
    --role owner
```
After we have authenticated, we will define an Automated ML experiment. For that we need to:
In order to have access to the data inside Azure ML Studio, a dataset must be registered. This is done in the Datasets section of ML Studio. Azure ML Studio offers several ways to register a dataset: from local files, from a datastore, from web files, and even from open datasets. We will use the "From web files" option:
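The same registration can also be done programmatically. Below is a minimal sketch using the azureml-core Python SDK, assuming a config.json for the workspace is present; the URL and the dataset name are placeholders, not the values actually used in the Studio:

```python
from azureml.core import Workspace, Dataset

ws = Workspace.from_config()

# Create a tabular dataset directly from a web file (a CSV is assumed here).
dataset = Dataset.Tabular.from_delimited_files(
    path="https://<your-data-host>/train.csv"  # placeholder URL
)

# Register it so it appears in the Datasets section of ML Studio.
dataset = dataset.register(
    workspace=ws,
    name="training-dataset",  # hypothetical dataset name
    description="Dataset registered from a web file",
)
```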
To be able to trigger the pipeline from other CI/CD pipelines, we need to publish it. Publishing the pipeline gives us a REST endpoint to interact with, which now has the status Active:
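Once published, the pipeline can be triggered from any CI/CD system with a plain HTTP POST against that endpoint. A minimal sketch, assuming interactive authentication (a CI/CD pipeline would use a service principal instead); the endpoint URL and experiment name are placeholders:

```python
import requests
from azureml.core.authentication import InteractiveLoginAuthentication

# Obtain a bearer-token header for the REST call.
auth_header = InteractiveLoginAuthentication().get_authentication_header()

# Copy the URL from the published pipeline's detail page in ML Studio.
rest_endpoint = "https://<region>.api.azureml.ms/pipelines/..."  # placeholder

response = requests.post(
    rest_endpoint,
    headers=auth_header,
    json={"ExperimentName": "pipeline-rest-run"},  # hypothetical experiment name
)
response.raise_for_status()
print("Submitted pipeline run:", response.json().get("Id"))
```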
When we click on the Pipeline runs tab, we can see our submitted run:
For the purposes of this project the training pipeline is fairly simple: it contains only the dataset and the AutoML steps:
We will run the pipeline to generate the ML model:
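For reference, the same dataset + AutoML pipeline can be assembled and submitted with the SDK, as sketched below. This assumes a classification task on the registered dataset; the dataset name, label column, compute cluster, and experiment name are placeholders:

```python
from azureml.core import Workspace, Experiment, Dataset
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import AutoMLStep
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()
dataset = Dataset.get_by_name(ws, name="training-dataset")  # hypothetical name

automl_config = AutoMLConfig(
    task="classification",                # assumed task type
    training_data=dataset,
    label_column_name="y",                # placeholder label column
    compute_target="cpu-cluster",         # placeholder compute cluster
    experiment_timeout_minutes=30,
    primary_metric="AUC_weighted",
)

# The AutoML step wraps the whole model search as a single pipeline step.
automl_step = AutoMLStep(name="automl_step", automl_config=automl_config, allow_reuse=True)

pipeline = Pipeline(workspace=ws, steps=[automl_step])
pipeline_run = Experiment(ws, "ml-pipeline-experiment").submit(pipeline)  # hypothetical name
pipeline_run.wait_for_completion(show_output=True)
```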
The AutoML pipeline in particular performs a number of runs to determine the best model architecture for us; all we have to do is select the best run for further deployment:
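On the SDK side, the best child run and its fitted model can be pulled directly off the AutoML run. A short sketch, with the experiment name and run id as placeholders:

```python
from azureml.core import Workspace, Experiment
from azureml.train.automl.run import AutoMLRun

ws = Workspace.from_config()
experiment = Experiment(ws, "ml-pipeline-experiment")          # hypothetical name
automl_run = AutoMLRun(experiment, run_id="<automl-run-id>")   # placeholder run id

# get_output() returns the best child run together with its fitted model.
best_run, fitted_model = automl_run.get_output()
print(best_run.id)
```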
This is the step that makes our model useful outside the Studio and brings it one step closer to the users.
To deploy a model from a run, we need to go to the Models tab:
Clicking the Deploy button gives us a published endpoint:
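The same deployment can also be scripted. A minimal sketch deploying the registered best model as an ACI webservice with key-based authentication enabled; the model name, entry script, environment, and service name are all assumptions:

```python
from azureml.core import Workspace, Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()
model = Model(ws, name="best-automl-model")  # hypothetical registered model name

inference_config = InferenceConfig(
    entry_script="score.py",                                 # hypothetical scoring script
    environment=Environment.get(ws, name="AzureML-AutoML"),  # curated env, assumed available
)

deployment_config = AciWebservice.deploy_configuration(
    cpu_cores=1,
    memory_gb=1,
    auth_enabled=True,  # key-based auth on the endpoint
)

service = Model.deploy(
    workspace=ws,
    name="automl-endpoint",  # hypothetical service name
    models=[model],
    inference_config=inference_config,
    deployment_config=deployment_config,
)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)
```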
This step is crucial for any webservice intended for production. Logging offers insights and early warning signs that our service might not be doing well. Logging is also used to investigate incidents and failures.
Running the enable-ai.py script will enable logging for our model endpoint.
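Under the hood, enabling Application Insights comes down to a single update call on the deployed webservice. A minimal sketch of what such a script might look like, with a placeholder service name:

```python
from azureml.core import Workspace
from azureml.core.webservice import Webservice

ws = Workspace.from_config()
service = Webservice(ws, name="automl-endpoint")  # placeholder service name

# Turn on Application Insights logging for the deployed endpoint.
service.update(enable_app_insights=True)
print("Application Insights enabled for:", service.name)
```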
The nice thing about ML Studio is that the published endpoint comes with a swagger.json, a way of documenting APIs that is highly readable for both humans and machines.
Check the endpoint.py script to see an example of consuming the REST endpoint.
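In essence, consuming the endpoint is an HTTP POST of a JSON payload with the authentication key in the header. A minimal sketch along those lines; the scoring URI, key, and input fields are placeholders, and the real payload shape is defined by the model's input schema (see endpoint.py):

```python
import json
import requests

scoring_uri = "http://<your-endpoint>.azurecontainer.io/score"  # placeholder
key = "<your-primary-key>"                                      # placeholder

# The payload shape depends on the columns the model was trained on.
data = {"data": [{"feature_1": 1, "feature_2": "value"}]}  # placeholder features

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {key}",
}

response = requests.post(scoring_uri, headers=headers, data=json.dumps(data))
print(response.json())
```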
You can find a screencast here: https://youtu.be/cCXFPUZrlMg
In order to make this project ready for prime time, I would investigate and add model versioning, which is required to allow a fallback in case the new model is faulty. I would also add a benchmarking step to the pipeline to be able to assess the quality of the new model.