/mlops-r-gha


MLOPS with R: An end-to-end process for building machine learning applications

This repository contains resources for the talk "MLOPS with R: An end-to-end process for building machine learning applications".

In addition to the slides (see below), this repository contains the complete set of code and GitHub Actions to deploy a Shiny application for calculating the probability of a fatal road accident. See below for instructions on how to deploy this application yourself.

Screenshot of Shiny app

Talk Abstract

As predictive models and machine learning become key components of production applications in every industry, an end-to-end Machine Learning Operations (MLOPS) process becomes critical for reliable and efficient deployment of applications that depend on R-based models. In this talk, I’ll outline the basics of the DevOps process and focus on the areas where MLOPS diverges. The talk will show the complete process of building and deploying an application driven by a machine learning model implemented with R. We will show the process of developing models, triggering model training on code changes, and triggering the CI/CD process for an application when a new version of a model is registered. We will use the Azure Machine Learning service and the “azuremlsdk” package to orchestrate the model training and management process, but the principles will apply to MLOPS processes generally, especially for applications that involve large amounts of data or require significant computing resources.

Presentations (Slides)

Aug 2020: New York R Conference (online).
MLOPS with R: An end-to-end process for building machine learning applications: slides (PDF) | Video Recording (forthcoming)

Resources

Links and other useful resources from the talk.

Azure Machine Learning service:

  • Documentation
  • Free Azure credits: register here. (A credit card is required, but it won't be charged unless you remove the spending limit.)

azuremlsdk R package:

GitHub Actions:

Visual Studio Code:

Data file nassCDS.csv:

Related Presentations

Machine Learning Operations with R (January 2020)

Application Architecture

The application is a Shiny app running on an instance of the Azure Data Science VM. The Azure ML service trains the model and deploys the scoring endpoint from R scripts, and GitHub Actions orchestrates the app deployment.

Architecture
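
As a rough illustration of how the Shiny app might talk to the scoring endpoint, here is a minimal sketch in R. The function names `build_payload` and `score_accident`, the `scoring_uri` argument, and the choice of input fields are all assumptions made for this example; the app in this repository may use different inputs or call the endpoint another way.

```r
library(jsonlite)

# Hypothetical helper: turn one set of user inputs into the JSON body of a
# scoring request. The field names are illustrative and may not match the
# inputs the deployed model expects.
build_payload <- function(dvcat, seatbelt, frontal, sex, ageOFocc) {
  inputs <- data.frame(
    dvcat = dvcat,
    seatbelt = seatbelt,
    frontal = frontal,
    sex = sex,
    ageOFocc = ageOFocc,
    stringsAsFactors = FALSE
  )
  # A data frame is serialized as a JSON array with one object per row.
  toJSON(inputs)
}

# Hypothetical helper: POST the payload to the endpoint and return the parsed
# response. `scoring_uri` stands in for the URI of your deployed web service.
score_accident <- function(scoring_uri, payload) {
  resp <- httr::POST(
    scoring_uri,
    body = as.character(payload),
    httr::content_type_json()
  )
  httr::stop_for_status(resp)
  httr::content(resp, as = "parsed")
}
```

In a Shiny server function, `build_payload()` would typically be called inside a reactive expression with values taken from `input`, and the result passed to `score_accident()`.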

Instructions for deploying the "Accident" app

  1. Fork this repository.

  2. Follow the directions in ML Ops with GitHub Actions and Azure Machine Learning to:

    • Create a resource group in your Azure subscription. (If you don't have a subscription, create an Azure Free Subscription and get $200 in free Azure credits.)
    • Create a service principal
    • Add secrets to your forked repository
    • Configure the .cloud\.azure\workspace.json file. You can use an existing Azure ML Workspace, or, if none exists with the specified name, one will be created for you.
  3. Deploy an Azure Data Science Virtual Machine and configure it as the Shiny Server by following these instructions.
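
For the workspace.json file mentioned in step 2, the following R sketch writes a minimal configuration. The helper name `write_workspace_config` and the field names `name`, `resource_group`, and `create_workspace` are assumptions based on common Azure ML GitHub Actions templates; check the workspace.json that ships with this repository for the fields it actually expects.

```r
library(jsonlite)

# Hypothetical helper: write a minimal workspace configuration file.
# The field names are assumptions, not confirmed by this repository.
write_workspace_config <- function(path, name, resource_group) {
  cfg <- list(
    name = name,                     # Azure ML workspace name
    resource_group = resource_group, # resource group created in step 2
    create_workspace = TRUE          # ask for the workspace to be created if missing
  )
  # Create the parent directories if they do not exist yet.
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  # auto_unbox writes scalars as plain JSON values instead of 1-element arrays.
  write_json(cfg, path, auto_unbox = TRUE, pretty = TRUE)
  invisible(cfg)
}
```

From the root of your fork, a call such as `write_workspace_config(file.path(".cloud", ".azure", "workspace.json"), "my-workspace", "my-resource-group")` would produce the file; both names are placeholders for your own values.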