mlops-zoomcamp

This repository tracks my work for the MLOps Zoomcamp organised by DataTalksClub. The complete course repository is available on GitHub, and the video playlist is available on YouTube.

Overview

The course teaches the practical aspects of productionizing ML services, from collecting requirements to model deployment and monitoring.

Target Audience

Data scientists and ML engineers, as well as software engineers and data engineers interested in putting ML into production.

Pre-requisites

Python
Docker
Comfort with the command line
Prior exposure to machine learning (at work or from other courses, e.g. from ML Zoomcamp)
Prior programming experience (at least 1 year)

Timeline

Course Start: 16 May 2022

Course End:

Syllabus

Module 1: Introduction

What is MLOps
MLOps maturity model
Running example: NY Taxi trips dataset
Why do we need MLOps
Course overview
Environment preparation
Homework

Module 2: Experiment tracking

Experiment tracking intro
Getting started with MLflow
Experiment tracking with MLflow
Saving and loading models with MLflow
Model registry
MLflow in practice
Homework

Module 3: Orchestration and ML Pipelines

ML Pipelines: introduction
Prefect
Turning a notebook into a pipeline
Kubeflow Pipelines
Homework

Module 4: Model Deployment

Batch vs online
For online: web services vs streaming
Serving models in batch mode
Web services
Streaming (Kinesis/SQS + AWS Lambda)
Homework

Module 5: Model Monitoring

ML monitoring vs software monitoring
Data quality monitoring
Data drift / concept drift
Batch vs real-time monitoring
Tools: Evidently, Prometheus and Grafana
Homework

Module 6: Best Practices

DevOps
Virtual environments and Docker
Python: logging, linting
Testing: unit, integration, regression
CI/CD (GitHub Actions)
Infrastructure as code (Terraform, CloudFormation)
Cookiecutter
Makefiles
Homework

Module 7: Processes

CRISP-DM, CRISP-ML
ML Canvas
Data Landscape canvas
MLOps Stack Canvas
Documentation practices in ML projects (Model Cards Toolkit)

Project

An end-to-end project covering all of the topics above

Running example

To make it easier to connect different modules together, we’d like to use the same running example throughout the course.

Possible candidates: the NYC TLC trip record data (https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page), used to predict the ride duration or whether the driver will be tipped.
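The two candidate prediction targets can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the course's own code: the sample records are made up, the field names `tpep_pickup_datetime`, `tpep_dropoff_datetime`, and `tip_amount` are assumed to match the yellow-taxi schema, and the helpers `duration_minutes` and `was_tipped` are hypothetical names introduced here.

```python
from datetime import datetime

# Hypothetical sample records, made up for illustration. The field names are
# assumed to mirror the yellow-taxi columns; check the schema of the file
# you actually download before relying on them.
trips = [
    {"tpep_pickup_datetime": "2022-01-01 00:10:00",
     "tpep_dropoff_datetime": "2022-01-01 00:25:30",
     "tip_amount": 3.5},
    {"tpep_pickup_datetime": "2022-01-01 08:00:00",
     "tpep_dropoff_datetime": "2022-01-01 08:07:00",
     "tip_amount": 0.0},
]

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def duration_minutes(trip):
    """Regression target: ride duration in minutes."""
    pickup = datetime.strptime(trip["tpep_pickup_datetime"], TIME_FORMAT)
    dropoff = datetime.strptime(trip["tpep_dropoff_datetime"], TIME_FORMAT)
    return (dropoff - pickup).total_seconds() / 60


def was_tipped(trip):
    """Classification target: True if the recorded tip is above zero."""
    return trip["tip_amount"] > 0


# One (duration, tipped) pair per trip.
targets = [(duration_minutes(t), was_tipped(t)) for t in trips]
print(targets)  # [(15.5, True), (7.0, False)]
```

Ride duration gives a regression problem and the tipped flag gives a binary classification problem, so either target can carry the same dataset through every module.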

Instructors

Larysa Visengeriyeva
Cristian Martinez
Kevin Kho
Theofilos Papapanagiotou
Alexey Grigorev
Emeli Dral
Sejal Vaidya

tanmaybhardwaj/mlops-zoomcamp