Customer segmentation model based on credit behavior

Description 💬

Many organizations rely on data-driven methods to segment their market, creating subsets based on demographics, needs, priorities, common interests, and other psychographic or behavioral criteria in order to better understand their target audience.

The main objective of this project is to segment customers based on their credit card payment behavior over the last six months, in order to define marketing strategies.

Tools 🔨

The tools used to develop the project were:

Jupyter Notebook - Python
Pandas
NumPy
scikit-learn
Tableau
Heroku

Architecture

The following diagram shows the architecture used in the project, from the data source to the deployment of the website so that users can interact with the application.

Project Structure

Folder structure

app: Files needed to build the website with Streamlit and deploy it on Heroku.
images: Images used in the project.
customer_segmentation.ipynb: Main notebook, where the data is read and processed and the model is trained.
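
As a rough illustration of the kind of pipeline such a notebook could contain, the sketch below scales two numeric features and clusters them with scikit-learn. The README does not state which algorithm the project uses, so K-means, the number of clusters, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for two features such as balance and credit limit.
# These values are made up; they are NOT taken from the project's dataset.
rng = np.random.default_rng(42)
low = rng.normal(loc=[500, 1500], scale=[100, 200], size=(50, 2))
high = rng.normal(loc=[5000, 9000], scale=[500, 800], size=(50, 2))
X = np.vstack([low, high])

# Scale features first so that no single variable dominates the distance
# computation, then cluster. K-means with 2 clusters is an assumption.
model = make_pipeline(
    StandardScaler(),
    KMeans(n_clusters=2, n_init=10, random_state=0),
)
labels = model.fit_predict(X)

# Number of customers assigned to each segment.
print(np.bincount(labels))
```

Scaling before clustering matters here because monetary variables on very different ranges would otherwise carry unequal weight in the Euclidean distance used by K-means.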

Data

The data comes from the Kaggle dataset Credit Card Dataset for Clustering, which summarizes the payment behavior of about 9000 active credit card holders over the last six months.
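
A minimal sketch of how the data could be loaded and inspected with Pandas is shown below. The helper name `load_credit_data` is hypothetical, the `CUST_ID` and `BALANCE` column names are assumptions about the file layout, and the inline sample rows are made up so that the snippet runs without the real file.

```python
import io

import pandas as pd


def load_credit_data(source):
    """Read the credit card CSV from a path or file-like object.

    The CUST_ID column name is an assumption about the dataset layout;
    if it is present, it is used as the index.
    """
    df = pd.read_csv(source)
    if "CUST_ID" in df.columns:
        df = df.set_index("CUST_ID")
    return df


# Tiny made-up sample standing in for the real file (illustrative values only).
sample_csv = io.StringIO(
    "CUST_ID,BALANCE,CREDIT_LIMIT\n"
    "C1,40.9,1000\n"
    "C2,3202.5,7000\n"
    "C3,2495.1,\n"
)
df = load_credit_data(sample_csv)

print(df.shape)         # rows and columns
print(df.isna().sum())  # missing values per column
```

With the real dataset, the path to the downloaded CSV file would be passed instead of the in-memory sample.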

Analysis

Balance

The Balance variable has a mean of 1601.22 USD and a standard deviation of 2095.57 USD. Its distribution is clearly right-skewed, as is to be expected for financial data with a large number of outliers.
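
The summary statistics and the direction of the skew can be checked numerically as well as visually. The sketch below uses a hypothetical helper, `describe_skew`, on a synthetic log-normal sample; the sample is an assumption chosen only because it is right-skewed, and it does not reproduce the figures reported above.

```python
import numpy as np
import pandas as pd


def describe_skew(series):
    """Return mean, sample standard deviation, median and skewness."""
    values = series.dropna()
    return {
        "mean": values.mean(),
        "std": values.std(),       # pandas uses ddof=1 (sample std) by default
        "median": values.median(),
        "skew": values.skew(),     # positive value indicates a right skew
    }


# Synthetic right-skewed data standing in for a balance column (made up).
rng = np.random.default_rng(0)
balance = pd.Series(rng.lognormal(mean=7.0, sigma=1.0, size=1000), name="BALANCE")

stats = describe_skew(balance)
print(stats)
```

In a right-skewed distribution the mean sits above the median, because a small number of very large values pull the mean upward.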

Graph 1

Credit Limit

The CREDIT_LIMIT variable has a mean of 4522.09 USD and a standard deviation of 3659.24 USD. Its distribution is also clearly right-skewed.

Graph 2

Annotations 📢

This article describes the overall project, methods, main findings, conclusions and recommendations for future work.

Authors ✒️

Carlos Barros - GitHub, LinkedIn

cbarros7/holbertonschool-final_project