/mlops_project

An end-to-end MLOps project

Primary language: Jupyter Notebook. License: MIT.

MLOps

MLOps is the practice of applying continuous integration and continuous delivery to Machine Learning artifacts, treating them as a software product and keeping them inside a loop of Design, Model Development, and Operations.

In this paradigm, teams can easily collaborate on models, with clear tracking of the data throughout cleaning, processing, and feature creation. Automating every repetitive process avoids human error and reduces delivery time, so the team can stay focused on the business problem.

Some benefits:

  • Versioning data and code, making models auditable and reproducible.

  • Automated tests and builds ensure that artifacts work correctly and are available to the delivery pipelines.

  • An automated cycle makes deploying new models easier and faster.

The MLOps Project

The MLOps project is a learning path: it implements a case study that aims to be testable and reproducible within the CI/CD methodology, using the best programming practices.

The scope of this project is shown in the image below.

We will select the best tool to implement every step, integrate them, and build a Machine Learning Orchestrator. In the end, running and delivering new ML experiments will be as simple as typing a terminal command or clicking a button!


This project is structured following the cookiecutter template and uses tools such as pre-commit to make it easier for other people to contribute.

You can understand the project organization here.

Prerequisites

For mlops_project to work correctly, first install the prerequisites.

Contributing

If you are interested in contributing to this project but are a beginner on GitHub or have never contributed to a project before, consider following this tutorial.

If you are familiar with open source projects and have an idea of how to improve this project, follow the links below.

How to use?

If you are just interested in using this package, follow the steps below:

  1. Clone the repository

    Open a terminal (on Windows, make sure to use Git Bash), navigate to the desired destination folder, and clone the repository:

    git clone https://github.com/Schots/mlops_project.git

    The Makefile in the root folder defines a set of functions that automate repetitive processes in this project. Type "make" in the terminal to see the available functions.


  2. Create an environment & install requirements

    Create a Python virtual environment for the MLOps project on your local machine, using any tool you like. Activate the environment and install the requirements using make:

    make requirements

  3. Download data

    To download the raw dataset, use get_data by running:

    make data

    Type the dataset name when prompted. The zip file with the data will be downloaded and unzipped under the data/raw folder.
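
The steps above can be sketched as a single script. This is a minimal sketch and not part of the repository: the `run` helper, the `setup_project` function, the `.venv` directory name, and the choice of `python3 -m venv` are assumptions made for illustration. Only the clone URL and the `make requirements` and `make data` commands come from the steps above. The script is a dry run by default, so it only prints the commands it would execute.

```shell
#!/usr/bin/env bash
# Sketch of the setup steps as one script (hypothetical, not shipped with the repo).
# Dry run by default: each command is printed but not executed.
# Set RUN=1 to actually execute the commands.

REPO_URL="https://github.com/Schots/mlops_project.git"  # URL from the steps above
VENV_DIR=".venv"                                        # assumed environment name

# Print a command, and execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Run the steps in order, stopping at the first failure.
setup_project() {
    run git clone "$REPO_URL" &&
    run cd mlops_project &&
    run python3 -m venv "$VENV_DIR" &&   # any environment tool works here
    run . "$VENV_DIR/bin/activate" &&    # on Windows the path is Scripts/activate
    run make requirements &&
    run make data                        # prompts for the dataset name
}

setup_project
```

Saved under a hypothetical name such as setup.sh, `bash setup.sh` prints the six commands, and `RUN=1 bash setup.sh` executes them. When the script is run this way, the environment is active only inside the script, so activate it again in your own shell before working on the project.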


Contributors

This project is the result of the collaboration of many people. Feel free to contact us on LinkedIn.

Maykon Schots

Maike Reis

Bruno Messias

Elisa Ribeiro

Roberto Castaldelli

Jaime Vinícius

Gustavo Bruschi

Leon Silva

Project based on the cookiecutter data science project template. #cookiecutterdatascience