According to the Leo Vegas CEO, Gustaf Hagman (see the picture below), the company's overall revenue in 2018 was around €360 million. Given that online gambling accounts for only 10% of total gambling, and that more and more activity is moving online, the growth outlook is promising.
In this growing market, keeping customers satisfied is paramount so that they continue using the product. In this scenario, predicting customer churn from past behavior is a reasonable way to approach customer retention.
From a dataset containing daily aggregations for around 10k customers who signed up during a calendar year, define the prediction target and create a predictive model that you would be comfortable sharing with a hypothetical stakeholder.
After an initial exploratory data analysis and feature engineering, the probabilities generated by the BG-NBD model were used to define the target variable (churn/not churn). To predict customer churn, five models were built, compared, and evaluated. The final criterion for choosing the model was a trade-off between performance and explainability.
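To make the target definition concrete, here is a minimal sketch of how a BG-NBD "probability alive" can be turned into a churn label. It uses the standard closed-form expression for P(alive) under the BG/NBD model rather than the code in the notebooks. The parameter values in `PARAMS`, the 0.5 threshold, and the function names `p_alive` and `label_churn` are illustrative assumptions, not values taken from this project.

```python
def p_alive(frequency, recency, T, r, alpha, a, b):
    """BG/NBD probability that a customer is still active.

    frequency: number of repeat transactions (x)
    recency:   time of the last transaction (t_x), measured from the first one
    T:         length of the observation period for this customer
    r, alpha, a, b: fitted BG/NBD parameters
    """
    if frequency == 0:
        # In the BG/NBD model, dropout can only occur right after a repeat
        # transaction, so a customer with x = 0 is alive with probability 1.
        return 1.0
    ratio = (alpha + T) / (alpha + recency)
    return 1.0 / (1.0 + (a / (b + frequency - 1)) * ratio ** (r + frequency))


def label_churn(prob_alive, threshold=0.5):
    """Return 1 (churn) if the probability alive is below the threshold, else 0."""
    return int(prob_alive < threshold)


# Hypothetical parameter values, for illustration only (not fitted on this dataset).
PARAMS = dict(r=0.25, alpha=4.0, a=0.8, b=2.5)

# Hypothetical customers: (frequency, recency, T), all in the same time unit.
customers = {
    "recent_buyer": (3, 29, 30),
    "lapsed_buyer": (3, 2, 30),
}

for name, (x, t_x, T) in customers.items():
    prob = p_alive(x, t_x, T, **PARAMS)
    print(name, round(prob, 3), "churn" if label_churn(prob) else "not churn")
```

With the same number of repeat transactions, the customer whose last transaction is far in the past gets a much lower probability of being alive, which is what drives the churn label. The threshold is a modelling choice and would need to be justified for the stakeholder.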
Create and activate a conda environment with a name of your choice (here it is called churn_lv), then install the dependencies:
conda create --name churn_lv python=3.8
conda activate churn_lv
pip install -r requirements.txt
The notebooks should be executed in the following order:
- eda.ipynb:
  - performs an exploratory data analysis
- feature_eng.ipynb:
  - performs feature engineering
- bg-nbd_model.ipynb:
  - implements the BG-NBD model
- target_definition.ipynb:
  - uses the results from the BG-NBD model to define the target variable
- model_building.ipynb:
  - builds, selects, and evaluates the model
- model_building_reduced_features.ipynb:
  - removes some features, then builds, selects, and evaluates the model
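The execution order above can also be scripted. The following is a minimal sketch, assuming the notebooks live in the `notebooks` directory and that `jupyter` with `nbconvert` is installed in the active environment (neither is confirmed by this README). The helper names `build_commands` and `run_all` are hypothetical.

```python
import subprocess
from pathlib import Path

# Execution order as listed in this README.
NOTEBOOK_ORDER = [
    "eda.ipynb",
    "feature_eng.ipynb",
    "bg-nbd_model.ipynb",
    "target_definition.ipynb",
    "model_building.ipynb",
    "model_building_reduced_features.ipynb",
]


def build_commands(notebooks_dir="notebooks"):
    """Return one `jupyter nbconvert` command per notebook, in execution order."""
    return [
        [
            "jupyter", "nbconvert",
            "--to", "notebook",
            "--execute",
            "--inplace",  # overwrite each notebook with its executed version
            str(Path(notebooks_dir) / name),
        ]
        for name in NOTEBOOK_ORDER
    ]


def run_all(notebooks_dir="notebooks", dry_run=True):
    """Print the commands (dry run) or execute them, stopping at the first failure."""
    for cmd in build_commands(notebooks_dir):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `run_all()` only prints the commands. Calling `run_all(dry_run=False)` executes the notebooks one after another, so a failure in an early step, such as feature engineering, stops the pipeline before the later notebooks run.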
.
├── assets
├── data
├── .ipynb_checkpoints
├── models
├── notebooks
├── README.md
├── references
├── reports
├── requirements.txt
└── src