/ChurnPrediction

Primary Language: Jupyter Notebook

Churn prediction for customers in the banking system

Objectives

For any business, it is important to understand why customers stop using its services. Churn modeling lets companies design loyalty programs and retention campaigns that keep as many customers as possible. With this in mind, we set the following objectives.

  • Apply different machine learning and deep learning algorithms to a customer churn data set from a bank.
  • Construct a predictive model that identifies clients who are likely to churn.
  • Verify the accuracy of the models with different accuracy tests.

Introduction

For a company with a large customer base, it is impractical to identify churn manually or to track the pattern of who is likely to stop doing business with the company. Companies therefore use data mining techniques that predict customer churn efficiently. To build an effective churn model, it is vital to choose the right variables (features) from the customer data and a model that suits that feature set. Churn models aim to detect early churn signals and to recognize customers with an increased likelihood of leaving voluntarily. Methods commonly used for churn prediction today include decision trees, logistic regression, SVM, Naïve Bayes, sequential pattern mining, and linear discriminant analysis.

Data Pre-processing

Minor initial changes included standardizing capitalization and replacing spaces with underscores, which makes the data more compatible with various data mining tools. We then applied several feature selection methods: correlation matrix with heatmap, zero-importance features, single-unique-value features, and low-importance features.
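The column clean-up and two of the feature checks can be sketched with pandas as follows. This is a minimal illustration, not the notebook's actual code: the toy column names, the choice of upper-case names, and the 0.9 correlation threshold are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def standardize_columns(df):
    """Upper-case column names and replace spaces with underscores (assumed convention)."""
    out = df.copy()
    out.columns = [c.strip().upper().replace(" ", "_") for c in out.columns]
    return out


def single_unique_columns(df):
    """Columns holding only one distinct value carry no signal for a model."""
    return [c for c in df.columns if df[c].nunique(dropna=False) <= 1]


def highly_correlated_columns(df, threshold=0.9):
    """Numeric columns whose absolute correlation with an earlier column exceeds the threshold."""
    corr = df.select_dtypes(include="number").corr().abs()
    # Keep only the upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    return [c for c in upper.columns if (upper[c] > threshold).any()]


# Hypothetical toy data standing in for the bank data set.
toy = pd.DataFrame({
    "Credit Score": [600, 700, 650, 720],
    "Balance": [1000.0, 0.0, 500.0, 1200.0],
    "Balance Copy": [2000.0, 0.0, 1000.0, 2400.0],  # exactly 2 * Balance
    "Country Code": ["X", "X", "X", "X"],           # constant column
})

clean = standardize_columns(toy)
constant_cols = single_unique_columns(clean)
correlated_cols = highly_correlated_columns(clean)

print(list(clean.columns))
print("single unique value:", constant_cols)
print("highly correlated:", correlated_cols)
```

The zero-importance and low-importance checks need a fitted model that reports feature importances, so they are left out of this sketch.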

Methodology

We split the data into training and test sets. The models learn from the training set, and we measure their prediction accuracy on the test set. For prediction we applied Artificial Neural Networks (ANNs), XGBoost, Logistic Regression, Linear Discriminant Analysis, Quadratic Discriminant Analysis, K-Nearest Neighbors (KNN), Decision Tree, and Support Vector Machine (SVM).
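A split-and-compare workflow of this kind might look like the sketch below, written with scikit-learn. It rests on several assumptions: the data is synthetic (generated with `make_classification`) rather than the bank data set, the 80/20 stratified split and default hyperparameters are illustrative choices, and the ANN and XGBoost models are omitted to keep the example dependent on scikit-learn only.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the churn data (about 20% positive class).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, n_redundant=0,
    weights=[0.8, 0.2], random_state=0,
)

# Hold out 20% of the rows for testing; stratify to keep the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale-sensitive models are wrapped in a pipeline with StandardScaler.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}

# Fit each model on the training set and score it on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Because churn data is usually imbalanced, accuracy alone can look good for a model that rarely predicts churn; metrics such as precision, recall, or ROC AUC from `sklearn.metrics` are worth reporting alongside it.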