When you are designing a machine learning system, your job doesn't end with building the model—and achieving a high accuracy score and a low validation error! For the model to be actually helpful—you will have to consider deploying it, and also ensure that the model's performance does not degrade over time. And MLOps is a set of best practices for putting machine learning models into production.
The various stages in a machine learning project can be broadly captured in the following three steps:
- Design: In the
design
step, you are considering the problem at hand—to decide whether or not you'll need a machine learning algorithm to achieve the objective. - Train: Once you decide on using a machine learning algorithm, you
train
the model and optimize its performance on the validation dataset. - Operate: The
operate
state captures the performance of the model after it's deployed. Some of the questions that you'll answer throughout the course, include:
- If the performance of the model degrades, can you retrain the model in a cost-effective manner?
- How do you ensure that the deployed model performs as expected—that is, how do you monitor the model's performance in production?
- What are the challenges associated with monitoring ML models?
When machine learning (ML) is used to solve a business problem, one could argue that delivering the model output to the end-user in a reliable way is an integral part of the machine learning process.
For a clearer picture, we can look at a life-cycle of a machine learning model:
flowchart LR
subgraph ML Model
D([Monitor]) -.->|retrain when performance drops| B([1. Train])
B --> C([2. Deploy])
C --> D([3. Monitor])
end
A([Data & <br>Problem Design]) --->|if ML helps solve problem| B
- Data & Problem Design: Choose machine learning to solve a problem when there is no other more straightforward approach and the data has sufficient quality.
- 1️⃣ Train: Train and evaluate ML models and choose the best performing one.
- 2️⃣ Deploy: Integrate the chosen model into the production environment (web service, module, embedded system, etc.)
- 3️⃣ Monitor: Capture the model's performance in the production environment and define a threshold for an acceptable value.
Depending on the use case, team skills, and established best practices, each of the life-cycle stages could be realized manually or with more automation support. MLOps is a practice that could support the maturity of the life-cycle iterations.
MLOps brings DevOps principles to machine learning. It is a set of best practices to put ML models into production and automate the life-cycle.
MLOps could help to
- track model iterations and reproduce results reliably,
- monitor model performance and deliver domain-specific metrics for ML,
- and deliver models safely and automatically into production.
The extent to which MLOps is implemented into a team or organization could be expressed as maturity. A framework for classifying different levels of MLOps maturity is listed below:
Lvl | Overview | Use Case | |
---|---|---|---|
0️⃣ | No MLOps |
|
|
1️⃣ | DevOps but no MLOps |
|
|
2️⃣ | Automated Training |
|
|
3️⃣ | Automated Model Deployment |
|
|
4️⃣ | Full MLOps Automated Operations |
|
|
A high maturity level is not always needed because it comes with additional costs. The trade-off between automated model maintenance and the required effort to set up the automation should be considered. An ideal maturity level could be picked based on the use case / SLAs and the number of models deployed.
If you want to read more on maturity, visit Microsoft's MLOps maturity model.