Build a predictive model with supervised learning methods to improve the coupon acceptance rate, using Python.

In-Vehicle Coupon Recommendation Project

We are a team of 7 (see the collaborators list) who worked together to build an enhanced predictive model as our final project for the Rakamin Data Science Bootcamp. We gathered the dataset from here; if you are curious about the dataset, please follow the link. Our main objective in this project is to build an enhanced predictive model for coupon recommendation, as a solution to the business problem of coupon acceptance rate. The project workflow was organized into stages, summarized below:

Stage 0 - Project Background

Stage 0 is the initial stage, where we carried out the 'ask' step of the data life cycle. It covers our role, problem statement, goal, objective, and the business metrics of our project.

Stage 1 - Exploratory Data Analysis

Stage 1 is the next step, where we focused on gathering insights from a statistical point of view.

What we did in this stage:

  • Descriptive Analysis
  • Univariate Analysis
  • Multivariate Analysis
  • Business Insight As Business Recommendation
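
As a rough illustration of the analyses listed above, here is a minimal pandas sketch. The DataFrame, its column names (`coupon`, `weather`, `accepted`), and its values are hypothetical toy data made up for this example; they are not taken from the project's dataset or notebooks.

```python
import pandas as pd

# Toy data: column names and values are hypothetical, not the project's dataset.
df = pd.DataFrame({
    "coupon": ["Coffee House", "Bar", "Coffee House", "Restaurant", "Bar", "Coffee House"],
    "weather": ["Sunny", "Rainy", "Sunny", "Sunny", "Snowy", "Rainy"],
    "accepted": [1, 0, 1, 1, 0, 0],
})

# Descriptive analysis: summary statistics for every column.
summary = df.describe(include="all")

# Univariate analysis: overall share of accepted coupons (the target).
overall_rate = df["accepted"].mean()

# Multivariate analysis: acceptance rate broken down by coupon type.
rate_by_coupon = (
    df.groupby("coupon")["accepted"].mean().sort_values(ascending=False)
)

print(summary)
print(f"Overall acceptance rate: {overall_rate:.2f}")
print(rate_by_coupon)
```

A breakdown like `rate_by_coupon` is the kind of table that can be turned into a business recommendation, for example by prioritizing the coupon types with the highest acceptance rate.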

Stage 2 - Preprocessing Data

Stage 2 is the step where we cleaned and transformed the data before using it to build the model.

What we did in this stage:

  • Handle Missing Values
  • Handle Duplicated Data
  • Handle Outliers
  • Feature Transformation
  • Feature Encoding
  • Handle Class Imbalance
  • Feature Selection
  • Feature Extraction
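
A few of the steps above (duplicates, missing values, feature encoding) can be sketched with pandas as follows. This is a minimal sketch under assumptions: the `preprocess` helper, the column names, and the imputation choices (median for numeric, mode for categorical) are illustrative and are not necessarily what the project notebooks do.

```python
import pandas as pd

# Toy data: hypothetical columns with a duplicate row and missing values.
raw = pd.DataFrame({
    "age": [21, 35, None, 35, 50],
    "weather": ["Sunny", "Rainy", "Sunny", "Rainy", None],
    "accepted": [1, 0, 1, 0, 1],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop duplicates, impute, one-hot encode."""
    # Handle duplicated data: keep the first occurrence of each row.
    out = df.drop_duplicates().copy()
    # Handle missing values: median for numeric, mode for categorical.
    out["age"] = out["age"].fillna(out["age"].median())
    out["weather"] = out["weather"].fillna(out["weather"].mode().iloc[0])
    # Feature encoding: one-hot encode the categorical column.
    out = pd.get_dummies(out, columns=["weather"], dtype=int)
    return out.reset_index(drop=True)


clean = preprocess(raw)
print(clean)
```

In a real pipeline the imputation statistics would normally be computed on the training split only and then applied to the test split, to avoid leaking information from the test data.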

Stage 3 - Modeling and Evaluation

Stage 3 is the step where we trained machine learning models on the training data and evaluated them. In this stage we created 7 different preprocessing treatments of the dataset and tested each of them on 5 different models: logistic regression, decision tree, random forest, XGBoost, and CatBoost. The aim was to learn, by the end of this stage, not only which model performs better but also which preprocessing treatment works better. Data science is experimental, so we ran these combinations to get a better result.

What we did in this stage:

  • Preprocessing Data
  • Splitting Data (Training Data and Test Data)
  • Feature Engineering
  • Model Testing
  • Hyperparameter Tuning and Feature Selection
  • Model Selection
  • Evaluating the Features with the Most Impact on Model Output Using the SHAP Library
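
The split, test, and select steps above can be sketched with scikit-learn as shown here. This is only an illustration: it uses synthetic data from `make_classification` instead of the coupon dataset, default-like hyperparameters, and only three of the five model families (XGBoost and CatBoost are left out to keep the example free of extra dependencies). Picking the winner by precision is an assumption made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed dataset (not the project's data).
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Splitting data: hold out 20% as the test set, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Candidate models; hyperparameters here are illustrative defaults.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=42),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Model testing: fit on the training set, score on the test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
    }

# Model selection: pick the model with the highest test precision.
best = max(results, key=lambda name: results[name]["precision"])

for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.2f}, precision={scores['precision']:.2f}")
print(f"Best by precision: {best}")
```

To compare preprocessing treatments as well as models, the same loop could be run once per treated version of the dataset, storing the scores under a (treatment, model) key.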

Stage 4 - Final Presentation Material

Based on our project's results, our best model is CatBoost, with an accuracy of 77% and a precision of 77%. Using the model could increase the coupon acceptance rate and raise the B/C ratio by 0.61x (from 1.7x to 2.31x).
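
The arithmetic behind the B/C figure can be checked in a few lines. The two ratios come from the sentence above; the variable names are made up for this sketch, and the relative gain is a derived number that the original summary does not state.

```python
# B/C ratios quoted in the project summary.
baseline_bc = 1.70  # before applying the model
model_bc = 2.31     # with the CatBoost model

# Absolute gain in the ratio, as reported (0.61x).
absolute_gain = model_bc - baseline_bc

# Relative gain over the baseline (derived here, not stated in the summary).
relative_gain = absolute_gain / baseline_bc

print(f"Absolute gain: {absolute_gain:.2f}x")
print(f"Relative gain: {relative_gain:.1%}")
```

In other words, a rise from 1.7x to 2.31x is an improvement of roughly 36% over the baseline ratio.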

NOTE

  • We presented the progress of each stage in Bahasa