MLOps Recipes

Learn how to create and run modular and minimalistic MLOps pipelines.

Goal
To create a library of modular recipes (parameterized DevOps pipeline templates) that can be composed into custom end-to-end CI/CD pipelines for machine learning.

Why care? To sustain the business benefits of machine learning across an organization, we need discipline, automation, and best practices. Enter MLOps.

Approach

  1. Minimalistic: the focus is on clean, understandable pipelines and code
  2. Modular: atomic recipes that can be referenced and reused (e.g. the recipe "Deploy to production after approval")

Status: Project board

Technologies: Azure Machine Learning & Azure DevOps

Technical Aspects

  1. Fully YAML-based, multistage CI/CD pipelines (no classic release pipelines in Azure DevOps)
  2. YAML-based variable templates (no need to configure variable groups through the UI)
  3. Gated releases (manual approvals)
  4. CLI-based MLOps: the Azure ML CLI is invoked from the Azure DevOps pipelines as the mechanism for interacting with the ML platform. Simple and clean (see the sketch below).
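
In the repo itself the CLI calls live in bash steps inside the YAML pipelines. Purely as an illustration of the CLI-first idea, here is a minimal Python sketch that shells out to the same kind of command a pipeline step would run. The az ml model register command and flags assume the Azure ML CLI v1 (azure-cli-ml) extension; the workspace, resource group, and model names are placeholders, not values from this repo.

```python
# Minimal sketch of a CLI-based MLOps step: registering a trained model binary
# with the Azure ML CLI. Assumes the Azure CLI plus the azure-cli-ml (CLI v1)
# extension are installed and you are already logged in (az login).
# Workspace, resource group, and model names below are placeholders.
import subprocess


def register_model(model_path: str, model_name: str,
                   workspace: str = "my-aml-workspace",
                   resource_group: str = "my-resource-group") -> None:
    """Shell out to the Azure ML CLI, the same call a pipeline bash step would run."""
    cmd = [
        "az", "ml", "model", "register",
        "--name", model_name,
        "--model-path", model_path,
        "--workspace-name", workspace,
        "--resource-group", resource_group,
    ]
    # check=True makes a failed CLI call raise, so the surrounding pipeline step fails too.
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    register_model("outputs/model.pkl", "german-credit-model")
```

Keeping the interaction at the CLI level is what keeps the pipelines thin: each step only orchestrates a command, and the ML platform does the heavy lifting.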

Get Started

  1. Understand what we are trying to do (see the section below)
  2. Set up the environment
  3. Run an end-to-end MLOps pipeline

Note: Automated builds triggered by code/asset changes have been disabled by setting trigger: none in the pipelines, to avoid kicking off accidental builds during your learning phase.

MLOps Flow (diagram)



Notes on our base scenario:

  1. Directory structure
    1. mlops contains the Azure DevOps pipelines
      1. env_create_pipelines contains pipelines to provision the components in the cloud (ML workspace, AKS cluster, etc.)
      2. model_pipelines contains individual pipelines for each of the models
      3. recipes contains parameterized, reusable DevOps pipeline templates for different scenarios
    2. models contains the source code for the individual models (training, scoring, etc.)
    3. setup contains documentation on usage
  2. Training: we train a simple LogisticRegression model on the German Credit dataset. We build an sklearn pipeline that performs the feature engineering and export the whole pipeline as the model binary (a pkl file); see the sketch after this list.
  3. We use the Azure ML CLI to interact with Azure ML because of its simplicity.
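
The following is a minimal sketch of that training idea: a single sklearn pipeline that bundles the feature engineering with a LogisticRegression classifier and is exported as one pkl file. The dataset path, target column name (Risk), and output path are placeholders, not the repo's actual values.

```python
# Minimal sketch: feature engineering + LogisticRegression in one sklearn Pipeline,
# exported as a single pkl binary. Paths and column names are placeholders.
import os

import joblib
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.read_csv("data/german_credit.csv")       # placeholder dataset path
X, y = df.drop(columns=["Risk"]), df["Risk"]     # placeholder target column

numeric_cols = X.select_dtypes(include="number").columns.tolist()
categorical_cols = X.select_dtypes(exclude="number").columns.tolist()

# Feature engineering and the classifier live in one pipeline, so the exported
# binary contains everything needed at scoring time.
pipeline = Pipeline([
    ("features", ColumnTransformer([
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
pipeline.fit(X_train, y_train)
print("holdout accuracy:", pipeline.score(X_test, y_test))

# Export the whole pipeline as the model binary (pkl file).
os.makedirs("outputs", exist_ok=True)
joblib.dump(pipeline, "outputs/model.pkl")
```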

Acknowledgements

  1. The MLOpsPython repo was one of the inspirations for this project - thanks to the contributors
  2. German Credit Dataset:
    Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.