Introduction

This repository holds the data and code for my bachelor's thesis: "Exploring machine learning models for churn prediction of membership subscriptions". The goal is to test different ML models for predicting whether a member will renew their yearly membership.

The metric of choice for this classification task (predicting non-renewals) is precision, with a minimum acceptable value of 75%. That said, both the metric and the threshold may be re-evaluated after exploratory data analysis (EDA) has established the class balance.
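As a minimal sketch of how the precision criterion above could be checked, the snippet below computes precision for the non-renewal class and compares it to the 75% threshold. The labels, the encoding (1 = did not renew, 0 = renewed), and the names `precision` and `PRECISION_THRESHOLD` are illustrative assumptions, not part of the thesis code.

```python
def precision(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP) for the class labelled `positive`."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    if tp + fp == 0:
        # No positive predictions: precision is undefined, so report 0.0 here.
        return 0.0
    return tp / (tp + fp)


# Hypothetical labels (assumption): 1 = did not renew, 0 = renewed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

PRECISION_THRESHOLD = 0.75  # acceptance threshold stated in this README

score = precision(y_true, y_pred)
meets_threshold = score >= PRECISION_THRESHOLD
print(f"precision={score:.2f}, meets threshold: {meets_threshold}")
```

Precision only counts how many of the predicted non-renewals were correct; it ignores non-renewals the model missed, which is one reason the metric may be revisited once the class balance is known.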

Business objective

Devise an ML model that predicts which members won't renew their membership, and determine whether a non-renewing member can be converted 3 months prior to their renewal date.

Extra: If a probability model performs well, identify the users whose predicted chance of renewing is between 30% and 50%. Those users have the biggest potential to be converted from non-renewals to renewals.
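The selection described above could look roughly like the following sketch. It assumes the model outputs one renewal probability per member; the member IDs, the probabilities, the function name `conversion_candidates`, and the choice of inclusive bounds are all hypothetical.

```python
def conversion_candidates(renewal_probs, low=0.30, high=0.50):
    """Return member IDs whose predicted renewal probability lies in [low, high].

    Both bounds are treated as inclusive, which is an assumption.
    """
    return sorted(
        member for member, prob in renewal_probs.items() if low <= prob <= high
    )


# Hypothetical model output: member ID -> predicted probability of renewing.
renewal_probs = {
    "member_a": 0.12,
    "member_b": 0.35,
    "member_c": 0.50,
    "member_d": 0.81,
    "member_e": 0.42,
}

candidates = conversion_candidates(renewal_probs)
print(candidates)
```

Members below the lower bound are treated as unlikely to be converted, and members above the upper bound as likely to renew anyway, so neither group is targeted.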

Data

General facts:

The available data is inspired by PALMS data from a company named Business Networking International (BNI)
BNI creates local groups of entrepreneurs who form relationships and refer each other's businesses
Company business model: yearly membership subscription
BNI groups meet on a weekly basis

Datasets: PALMS

Monthly data from 2016-03 to 2021-02
Each month's PALMS data contains information on member and chapter performance
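Because the data arrives as one snapshot per month, it can help to enumerate the expected months and confirm that none is missing. The sketch below assumes both endpoints of the stated range are inclusive; the helper `month_range` is hypothetical and is not taken from the repository.

```python
def month_range(start, end):
    """List 'YYYY-MM' labels from start to end, both given as (year, month)."""
    year, month = start
    months = []
    while (year, month) <= end:
        months.append(f"{year:04d}-{month:02d}")
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months


expected_months = month_range((2016, 3), (2021, 2))
print(len(expected_months), expected_months[0], expected_months[-1])
```

Comparing this list against the months actually present in the PALMS files is a quick way to spot gaps before any modelling.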

Here are a couple of links with a legend/explanation of the PALMS data: Link 1 and Link 2.
