This repository contains the Sales Prediction API built on the Rossmann dataset. The relevant files are:
- Processing and Model Development - Main notebook used to process the data and train a model
- App.py - Contains the API
- pred_model2.joblib - Contains the pickled trained model; referenced in App.py
- preprocess.py - Contains the preprocessing functions that each new POST request undergoes; referenced in App.py
- Store.csv - Contains data on the 1,115 Rossmann stores; referenced in App.py
- Procfile, setup.sh, requirements.txt - Files needed to deploy the API on Heroku
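
How these files fit together at request time can be sketched roughly as follows. This is a minimal illustration, not the code in App.py: the names `handle_predict`, `toy_preprocess`, and `StubModel`, the request fields, and the store columns are all hypothetical. The real app would load pred_model2.joblib and Store.csv and use the functions in preprocess.py instead of these stand-ins.

```python
# Hypothetical sketch of the request flow. All names and fields below are
# assumptions for illustration; they are not taken from App.py or preprocess.py.

class StubModel:
    """Stands in for the trained model loaded from pred_model2.joblib."""

    def predict(self, rows):
        # Returns a fixed dummy value per row; a real model returns learned predictions.
        return [100.0 for _ in rows]


# Stands in for Store.csv: store ID -> store attributes (illustrative columns).
STORE_DATA = {1: {"StoreType": "c", "CompetitionDistance": 1270.0}}


def toy_preprocess(record):
    """Placeholder for preprocess.py: turn a merged record into a feature row."""
    return [record["Store"], record["DayOfWeek"], record["CompetitionDistance"]]


def handle_predict(payload, model, store_data, preprocess):
    """Merge the request with its store attributes, preprocess, and predict."""
    store_id = payload["Store"]
    if store_id not in store_data:
        raise KeyError(f"Unknown store: {store_id}")
    record = {**payload, **store_data[store_id]}
    features = preprocess(record)
    prediction = model.predict([features])[0]
    return {"Store": store_id, "Sales": float(prediction)}


result = handle_predict(
    {"Store": 1, "DayOfWeek": 5}, StubModel(), STORE_DATA, toy_preprocess
)
print(result)
```

Passing the model, store data, and preprocessing function as arguments keeps the handler easy to test with stand-ins like the ones above.
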
To develop this API, the following six steps were performed. These are explained in detail in the Processing and Model Development notebook.
- Data cleaning - null values, encoding, data type formatting
- Feature Engineering and Reduction - Generated 4 new features and performed PCA to determine the relevant features
- Model Selection - Tested 5 models, with adjusted R-squared and RMSE as the main metrics
- Pickle and Export - Exported the selected trained model (CatBoost) and the preprocessing functions
- API Development - A simple API was developed with an initial landing page and a /predict endpoint that takes a JSON POST request and returns the predicted sales
- Deployment - The API was deployed to Heroku at https://mynt-hbalcera.herokuapp.com/predict; testing was done using Postman
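
For reference, the two model selection metrics named above can be computed as in the sketch below. It uses the standard definitions (RMSE as the square root of the mean squared error, and adjusted R-squared as 1 - (1 - R²)(n - 1)/(n - p - 1) for n samples and p features). The function names and the sample values are illustrative assumptions, not taken from the notebook.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between actual and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def adjusted_r2(y_true, y_pred, n_features):
    """R-squared penalized for the number of features used by the model."""
    n = len(y_true)
    if n - n_features - 1 <= 0:
        raise ValueError("Need more samples than features plus one")
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_features - 1)


# Made-up sample values, only to show the calls.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]

model_rmse = rmse(actual, predicted)
model_adj_r2 = adjusted_r2(actual, predicted, n_features=1)
print(model_rmse, model_adj_r2)
```

Lower RMSE and higher adjusted R-squared indicate a better fit; the adjustment makes models with different numbers of features easier to compare.
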
For any inquiries, kindly reach out via heidemlbalcera@gmail.com or LinkedIn (https://www.linkedin.com/in/heidemae-balcera-sci/). This project is part of an application to Mynt.