BCF2 Group 2 - Health Insurance Cross-Sell Prediction

Repository for SC1015 Intro to DSAI AY21/22 Sem 2 project

This project applies the machine learning pipeline in the following order:

  1. Data Extraction & Preparation
  2. Exploratory Data Analysis
  3. Classification
  4. Regression

Team Members:

  1. Chia Lu Ting (EDA)
  2. Lai Shi Hong (Classification)
  3. Daniel Li Runze (Regression)

Context of Dataset: A model that predicts whether a customer would be interested in Vehicle Insurance helps the company plan its communication strategy for reaching those customers and optimise its business model and revenue.

Problem Statement: How do different variables affect the receptiveness of existing policyholders to the cross-selling of insurance? What is the pricing strategy of the insurance company?

Models Used:

Classification:
	1) Decision Tree
	2) Random Forest
	3) MLP
	4) XGBoost
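
As a rough illustration of how classifiers like these can be compared, the sketch below fits three of the listed model types on synthetic data. It is not the project's notebook code: the data comes from `make_classification` (an assumption, with a positive class of roughly 15% to mimic an imbalanced target), the hyperparameters are arbitrary, and XGBoost is left out so the example depends only on scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the insurance data (assumption): ~15% positive class.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.85, 0.15], random_state=0
)
# Stratify so the train and test sets keep the same class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters are illustrative, not tuned values from the project.
models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
}

f1_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # F1 is reported instead of accuracy because the classes are imbalanced.
    f1_scores[name] = f1_score(y_test, model.predict(X_test))
    print(f"{name}: F1 = {f1_scores[name]:.3f}")
```

With an imbalanced target, a model that always predicts "not interested" already scores high on accuracy, which is why the sketch compares models on F1 instead.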

Regression:
	1) LinearRegression
	2) DecisionTreeRegressor
	3) RandomForestRegressor
	4) TransformedTargetRegressor
	5) MLPRegressor
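
The regressors listed above can be compared in a similar way. The sketch below is a minimal example on synthetic data, not the project's actual code: the features, the right-skewed positive target, the log/exp transform passed to `TransformedTargetRegressor`, and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (assumption): a positive, right-skewed target in arbitrary units.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = np.exp(0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.2, size=1000))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "LinearRegression": LinearRegression(),
    "DecisionTreeRegressor": DecisionTreeRegressor(max_depth=5, random_state=0),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=100, random_state=0),
    # Fits a linear model on log(y) and maps predictions back with exp.
    "TransformedTargetRegressor": TransformedTargetRegressor(
        regressor=LinearRegression(), func=np.log, inverse_func=np.exp
    ),
    "MLPRegressor": MLPRegressor(
        hidden_layer_sizes=(32,), max_iter=500, random_state=0
    ),
}

r2_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # R^2 is computed on the original (untransformed) target scale.
    r2_scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: R^2 = {r2_scores[name]:.3f}")
```

`TransformedTargetRegressor` is a wrapper rather than a model of its own: it transforms the target before fitting the inner regressor and inverts the transform at prediction time, which can help when the target is heavily skewed.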

Conclusion:

1) 13-18% of customers are interested in the cross-sell.
2) The company should target customers who have had vehicle damage, own older vehicles, or are in the older age groups.
3) The dataset has insufficient data on customer background to conclusively determine the company's pricing mechanism.
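
Group-level findings such as the second conclusion are typically obtained by comparing the share of interested customers across groups. The sketch below shows that calculation on a handful of made-up rows. The column names `Vehicle_Damage` and `Response` are assumed here, and the toy values are illustrative only and do not reflect the project's results.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the real dataset.
toy = pd.DataFrame(
    {
        "Vehicle_Damage": ["Yes", "Yes", "No", "No", "Yes", "No", "Yes", "No"],
        "Response": [1, 0, 0, 0, 1, 0, 0, 0],  # 1 = interested in the cross-sell
    }
)

# The mean of a 0/1 column is the share of interested customers in each group.
rate_by_damage = toy.groupby("Vehicle_Damage")["Response"].mean()
print(rate_by_damage)
```

The same `groupby` pattern applies to any other categorical column, such as a vehicle age band or a binned age group.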

References: Dataset: https://www.kaggle.com/datasets/anmolkumar/health-insurance-cross-sell-prediction