Customer Churn Prediction with Azure Machine Learning:
From Kaggle Dataset to Productionalized Model

This repository contains an end-to-end ML lifecycle demo using Azure Machine Learning Studio.

The demo walks through various features of Azure Machine Learning Studio while solving a Kaggle challenge to predict customer churn. The Kaggle challenge can be found here: https://www.kaggle.com/blastchar/telco-customer-churn.

Prerequisites to rebuild the demo:

Azure subscription (with credits)
Some foundational Azure and Data Science knowledge

Demo Instructions

Step 1:

Create a resource group.

Step 2:

Create an Azure Machine Learning workspace.
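
Steps 1 and 2 can also be scripted instead of clicked through in the Azure Portal. The following is a minimal sketch, assuming the Azure Machine Learning Python SDK v1 (the `azureml-core` package) is installed and you are logged in to Azure. The function names and all argument values are hypothetical placeholders, not part of this repository.

```python
def workspace_settings(subscription_id, resource_group, workspace_name, location):
    """Collect the keyword arguments for Workspace.create (SDK v1)."""
    return {
        "name": workspace_name,
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "location": location,
        # Creates the resource group from Step 1 if it does not exist yet.
        "create_resource_group": True,
        # Do not fail if a workspace with this name already exists.
        "exist_ok": True,
    }


def create_workspace(settings):
    """Create the workspace described by `settings` and return it."""
    # Imported lazily so this file can be loaded without the SDK installed.
    from azureml.core import Workspace

    return Workspace.create(**settings)
```

Calling `create_workspace(workspace_settings("<subscription-id>", "<resource-group>", "<workspace-name>", "<region>"))` would then perform both steps in one go. The placeholders in angle brackets must be replaced with your own values.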

Step 3:

Enter your Azure Machine Learning workspace by clicking "Launch Now".

Step 4:

Create a compute instance in your Azure Machine Learning workspace.
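
If you prefer code over the Studio UI, a compute instance can be provisioned roughly as follows. This is a sketch under the assumption that the SDK v1 (`azureml-core`) is used. The function name is hypothetical, and the default VM size `STANDARD_DS3_V2` is an illustrative choice, not a requirement of this demo.

```python
def create_compute_instance(workspace, name, vm_size="STANDARD_DS3_V2"):
    """Provision a compute instance in `workspace` and wait until it is ready."""
    # Imported lazily so this file can be loaded without the SDK installed.
    from azureml.core.compute import ComputeInstance, ComputeTarget

    config = ComputeInstance.provisioning_configuration(
        vm_size=vm_size,
        ssh_public_access=False,  # Jupyter access through the Studio is enough here
    )
    instance = ComputeTarget.create(workspace, name, config)
    instance.wait_for_completion(show_output=True)
    return instance
```

Here `workspace` is an `azureml.core.Workspace` object, for example one loaded with `Workspace.from_config()`.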

Step 5:

Open Jupyter Notebooks in your compute instance.

Step 6:

Open a terminal, change to your user directory, and clone this repository.

Step 7:

Download the two data files from the "data" folder to your local machine.

Step 8:

Create an Azure Data Lake Gen2. You do this by creating a storage account and enabling the hierarchical namespace option during creation.

Create a container called "raw" in your Azure Data Lake Gen2.

Create two directories:

2020/03/31
2020/04/01
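
The two directory names follow a year/month/day layout, one directory per data snapshot. As a small illustration of that convention, the paths can be derived from dates with the Python standard library. The helper name `partition_path` is hypothetical.

```python
from datetime import date


def partition_path(day: date) -> str:
    """Return the year/month/day directory path for a snapshot date."""
    return day.strftime("%Y/%m/%d")


# The two snapshot dates used in this demo.
snapshot_days = [date(2020, 3, 31), date(2020, 4, 1)]
directories = [partition_path(day) for day in snapshot_days]
print(directories)  # ['2020/03/31', '2020/04/01']
```

Zero-padding the month and day keeps the directories in chronological order when they are sorted by name.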

Step 9:

Upload the two data files to their respective directories in your Azure Data Lake Gen2, matching each file to the directory for its date.

Step 10:

Register your Azure Data Lake Gen2 as a datastore in the Azure Machine Learning workspace. Register it as a storage account datastore, not as an Azure Data Lake Gen2 datastore.

First, retrieve your storage account key from the Azure Portal.
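
The registration can also be done from a notebook. The sketch below assumes that "storage account datastore" corresponds to the blob container datastore type in the SDK v1 (`azureml-core`), and that the account key is passed through an environment variable. The variable name `STORAGE_ACCOUNT_KEY`, the datastore name `raw_datastore`, and both function names are hypothetical. The notebooks in this repository may expect a specific datastore name, so check them before choosing one.

```python
import os


def read_account_key(env=os.environ, variable="STORAGE_ACCOUNT_KEY"):
    """Read the storage account key from the environment instead of hard-coding it."""
    key = env.get(variable)
    if not key:
        raise RuntimeError(f"Set {variable} to the storage account key first.")
    return key


def register_raw_datastore(workspace, account_name, account_key,
                           datastore_name="raw_datastore", container_name="raw"):
    """Register the "raw" container as a blob datastore in `workspace`."""
    # Imported lazily so this file can be loaded without the SDK installed.
    from azureml.core import Datastore

    return Datastore.register_azure_blob_container(
        workspace=workspace,
        datastore_name=datastore_name,
        container_name=container_name,
        account_name=account_name,
        account_key=account_key,
    )
```

Keeping the key out of the notebook source avoids committing it to the repository by accident.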

Step 11:

Step 12:

Install all necessary dependencies on your compute instance.

Step 13:

You can now run the notebooks. Specific explanations can be found as comments in the notebooks. You can omit running "03_customer_churn_train_decision_tree" and "04_customer_churn_train_automl" without affecting the downstream workflow.
