/customer-prediction

An example repo that uses MLflow to track model training parameters, with CI/CD for model training

Customer Experience Survey Prediction

A. Overview

This submission consists of the following sections: Folder Structure (B), Instructions (C), and Model pipeline (D).


B. Folder Structure

The following shows the folder structure of the repository.

  <base>
    ├── .env              # Required for the Mapbox API in Jupyter
    ├── dyson.ipynb       # Jupyter Notebook
    ├── conda-env.yaml    # conda env file for Jupyter Notebook
    ├── readme.md      
    ├── data
    │   └── cali_dyson_households.csv  # Provided data file
    ├── docker
    │   ├── ces.DockerFile             # dockerfile
    │   ├── docker-compose.yml         # docker-compose file
    │   └── requirements.txt           # env file for dockerfile
    ├── models
    │   └── lin_reg_pipe.pkl           # pickled file for use in frontend
    └── src
        └── app.py                     # frontend script
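The tree lists models/lin_reg_pipe.pkl as the pickled file used by the frontend. As a minimal sketch of how a script might load it, assuming the file is a standard pickle of a fitted scikit-learn pipeline (the helper name `load_pipeline` is hypothetical and not taken from src/app.py):

```python
import pickle
from pathlib import Path

# Path taken from the folder structure, relative to <base>.
MODEL_PATH = Path("models") / "lin_reg_pipe.pkl"


def load_pipeline(path=MODEL_PATH):
    """Unpickle and return the object stored at `path` (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

If the file does hold a fitted pipeline, `load_pipeline().predict(features)` would return predictions for a suitably shaped `features` input. Only unpickle files from a trusted source, and load the pipeline with the same scikit-learn version that was used to save it.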

C. Instructions

Unless otherwise stated, run the following commands on the command line at <base>. See the folder structure for more information.

C.1. Running Jupyter Notebook

1.1 To run the notebook sections, create and activate the conda environment defined in conda-env.yaml, using the commands:

conda env create -f conda-env.yaml
conda activate ces

C.2. Running Front-End

2.1 To run the Streamlit frontend via Docker, use the command:

docker compose -f docker/docker-compose.yml up -d

2.2 Once the image has been successfully built and the container is running, you can access the frontend in your browser at localhost:6006

2.3 To stop the container, use the command:

docker compose -f docker/docker-compose.yml down

D. Model pipeline

The following diagram shows the overall flow of the machine learning pipeline.

graph TD
    A[Preprocessing Pipeline] -- RandomizedSearchCV --> B(Histogram Gradient \n Boosting Regressor)
    A -- GridSearchCV --> C(Linear Regression)
    B -- Best_params--> D{Prediction \n on Test set}
    C -- Best_params--> D -- Final model choice --> E(Model in Production)