This repository tracks my work for the MLOps Zoomcamp organised by DataTalks.Club. The course's complete repository is available on GitHub, and the video playlist is available on YouTube.
The course teaches the practical aspects of productionizing ML services, from collecting requirements to model deployment and monitoring.
The course is aimed at data scientists and ML engineers, as well as software engineers and data engineers interested in putting ML in production.
- Python
- Docker
- Comfort with the command line
- Prior exposure to machine learning (at work or from other courses, e.g. from ML Zoomcamp)
- Prior programming experience (at least 1 year)
Course Start: 16 May 2022
Course End:
- What is MLOps
- MLOps maturity model
- Running example: NY Taxi trips dataset
- Why do we need MLOps
- Course overview
- Environment preparation
- Homework
- Experiment tracking intro
- Getting started with MLflow
- Experiment tracking with MLflow
- Saving and loading models with MLflow
- Model registry
- MLflow in practice
- Homework
- ML Pipelines: introduction
- Prefect
- Turning a notebook into a pipeline
- Kubeflow Pipelines
- Homework
- Batch vs online
- For online: web services vs streaming
- Serving models in batch mode
- Web services
- Streaming (Kinesis/SQS + AWS Lambda)
- Homework
- ML monitoring vs software monitoring
- Data quality monitoring
- Data drift / concept drift
- Batch vs real-time monitoring
- Tools: Evidently, Prometheus and Grafana
- Homework
- DevOps
- Virtual environments and Docker
- Python: logging, linting
- Testing: unit, integration, regression
- CI/CD (GitHub Actions)
- Infrastructure as code (Terraform, CloudFormation)
- Cookiecutter
- Makefiles
- Homework
- CRISP-DM, CRISP-ML
- ML Canvas
- Data Landscape canvas
- MLOps Stack Canvas
- Documentation practices in ML projects (Model Cards Toolkit)
- End-to-end project with all the things above
To make it easier to connect different modules together, we’d like to use the same running example throughout the course.
- Possible candidate: the NYC TLC trip record data (https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page), used to predict the ride duration or whether the driver will be tipped
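
The two candidate prediction targets mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the field names (`pickup_datetime`, `dropoff_datetime`, `tip_amount`), the timestamp format, and the example record are all hypothetical and are not taken from the course material or the actual dataset schema.

```python
from datetime import datetime

# Assumed timestamp format; the real dataset may store times differently.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def ride_duration_minutes(pickup: str, dropoff: str) -> float:
    """Regression target: trip duration in minutes."""
    start = datetime.strptime(pickup, TIME_FORMAT)
    end = datetime.strptime(dropoff, TIME_FORMAT)
    return (end - start).total_seconds() / 60


def was_tipped(tip_amount: float) -> bool:
    """Classification target: True if the driver received any tip."""
    return tip_amount > 0


# Made-up record for illustration only (hypothetical field names and values).
trip = {
    "pickup_datetime": "2022-01-01 10:00:00",
    "dropoff_datetime": "2022-01-01 10:12:30",
    "tip_amount": 2.5,
}

duration = ride_duration_minutes(trip["pickup_datetime"], trip["dropoff_datetime"])
tipped = was_tipped(trip["tip_amount"])
print(f"duration: {duration} min, tipped: {tipped}")
```

Ride duration makes this a regression problem, while the tipped/not-tipped label makes it a binary classification problem, so the same trip records could support either running example.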
- Larysa Visengeriyeva
- Cristian Martinez
- Kevin Kho
- Theofilos Papapanagiotou
- Alexey Grigorev
- Emeli Dral
- Sejal Vaidya