CD4ML-Scenarios

Repository with sample code and instructions for "Continuous Intelligence" and "Continuous Delivery for Machine Learning: CD4ML" workshops

Primary language: Python. License: MIT.

Continuous Intelligence and CD4ML Workshop

This repository contains the sample application and machine learning code used in the Continuous Delivery for Machine Learning (CD4ML) and Continuous Intelligence workshop.

This workshop is based on an existing CD4ML Workshop.

This material is developed and continuously updated by ThoughtWorks. It has been presented at conferences including ODSC Boston 2020 and ODSC Europe 2020.

You can also watch a recording of this material presented at a Global Webinar.

Pre-Requisites

To run this workshop, you will need:

  • A valid GitHub account
  • A working Docker setup with at least 20 GB of space free (if running on Windows, make sure to use Linux containers)
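The Docker prerequisite can be checked before starting. The following is a minimal sketch, not part of the workshop code: the function names `free_space_gb` and `check_prerequisites` are hypothetical, and it assumes Docker stores its data on the same volume as the path you pass in, which may not hold on your machine (for example, with Docker Desktop's virtual disk).

```python
import shutil

# Threshold taken from the prerequisites list above.
REQUIRED_FREE_GB = 20


def free_space_gb(path="."):
    """Return the free disk space at `path` in gigabytes (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def check_prerequisites(path=".", required_gb=REQUIRED_FREE_GB):
    """Return a list of problems found; an empty list means both checks passed.

    Assumption: `path` is on the volume where Docker keeps images and containers.
    """
    problems = []
    # Look for the docker executable on PATH; this does not verify the daemon is running.
    if shutil.which("docker") is None:
        problems.append("docker executable not found on PATH")
    free_gb = free_space_gb(path)
    if free_gb < required_gb:
        problems.append(
            f"only {free_gb:.1f} GB free at {path!r}; {required_gb} GB required"
        )
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    print("OK" if not issues else "\n".join(issues))
```

The script only reports problems. It does not confirm that the Docker daemon is running or, on Windows, that Linux containers are selected.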

Tools used in this workshop

As part of this workshop, all of these services will be set up for you automatically as Docker containers. You do not need to download or install them ahead of time.

Workshop Instructions

The workshop is divided into several steps. Instructions for each exercise and scenario can be found under the instructions folder. To start from the beginning click here.

Each exercise builds on the previous one, so you cannot skip ahead without completing the earlier steps.

The Machine Learning Problems

This workshop offers two scenarios that you can work through.

The first is a simplified solution to a Kaggle problem posted by Corporación Favorita, a large Ecuador-based grocery retailer interested in improving its sales forecasting using data. For the purposes of this workshop, we have combined and simplified their data sets, as our goal is not to find the best predictions, but to demonstrate how to implement CD4ML.

The second is a scenario based on a problem from the Zillow group, an American online real estate company interested in improving its predictions of real-estate prices.

Links to the different components of this scenario

After a successful setup of the environment, the following components will be running on your machine. You can find a homepage to navigate to any of these services here.

Collaborators

The material, ideas, and content developed for this workshop were contributions from (in alphabetical order):