-- THIS ANALYSIS IS DONE IN R USING JUPYTER NOTEBOOK --

What is Customer-Segmentation?

As the name suggests, this project is about dividing the customer base into different segments. The idea is to group customers who share similar characteristics. How these groups are formed depends on the business objectives and the data available. Through this project we can derive insights into customer lifetime value, purchase channels and product preferences, so a business can use the information to guide future decisions.

Customer segmentation can be achieved using a variety of customer demographics such as age, gender, marital status, etc. However, such information is not easily available. What is easily available is TRANSACTIONAL DATA (customer accounts, invoices, invoice dates and times, etc.). How can the customers be segmented using only that?

Although it depends on the business objectives, let's use RFM (Recency, Frequency and Monetary Value) metrics to identify the high value and low value customers of the business, so that marketing can target each group appropriately.

Data

The data was obtained from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Online+Retail

RFM (Recency, Frequency and Monetary Value) Variables

As previously mentioned, the data did not include any demographic information about the customers, so I derived three new metrics from the transactions and used them to segment!

RECENCY -- How recently did the customer make their last purchase?
FREQUENCY -- How often does the customer buy? How many purchases over the given time frame?
MONETARY VALUE -- How much revenue does each customer bring in?
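
The three definitions above can be turned into a per-customer table with a few lines of base R. This is only a minimal sketch on a made-up toy data frame: the column names (InvoiceNo, CustomerID, InvoiceDate, Quantity, UnitPrice) are assumed to follow the UCI Online Retail file, and the reference date (one day after the last invoice) is a hypothetical choice, not necessarily what the notebook uses.

```r
# Toy transactions (made-up values); column names assumed from the UCI file.
tx <- data.frame(
  InvoiceNo   = c("A1", "A1", "A2", "B1", "C1", "C2", "C3"),
  CustomerID  = c(1, 1, 1, 2, 3, 3, 3),
  InvoiceDate = as.Date(c("2011-11-01", "2011-11-01", "2011-12-05",
                          "2011-06-15", "2011-10-10", "2011-11-20",
                          "2011-12-08")),
  Quantity    = c(2, 1, 5, 10, 1, 3, 2),
  UnitPrice   = c(3.5, 10, 2, 1.25, 20, 4, 7.5)
)
tx$Amount <- tx$Quantity * tx$UnitPrice

# Reference date: one day after the last invoice (hypothetical choice).
ref_date <- max(tx$InvoiceDate) + 1

# One row per customer with the three RFM metrics.
rfm <- do.call(rbind, lapply(split(tx, tx$CustomerID), function(d) {
  data.frame(
    CustomerID = d$CustomerID[1],
    Recency    = as.numeric(ref_date - max(d$InvoiceDate)),  # days since last purchase
    Frequency  = length(unique(d$InvoiceNo)),                # number of distinct invoices
    Monetary   = sum(d$Amount)                               # total spend
  )
}))
print(rfm)
```

A low Recency together with high Frequency and Monetary values points to a high value customer; the opposite pattern points to a low value one.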

Dealing with Outliers - A Pareto Analysis

The Pareto principle says that roughly 80% of the results come from 20% of the causes! In this context, roughly 80% of sales come from 20% of the customers. In other words, the top 20% of customers contribute most of the sales -- these are our high value customers!
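
To check how closely a data set follows this rule, one can sort customers by total spend and measure the share of sales coming from the top 20%. The sketch below uses made-up spend values (the vector `monetary` is hypothetical), so the exact 80% it produces is by construction, not a result from the real data.

```r
# Hypothetical total spend per customer (made-up values for illustration).
monetary <- c(c1 = 500, c2 = 300, c3 = 50, c4 = 40, c5 = 30,
              c6 = 25, c7 = 20, c8 = 15, c9 = 12, c10 = 8)

sorted <- sort(monetary, decreasing = TRUE)

# Number of customers in the top 20%.
n_top <- round(0.2 * length(sorted))

# Share of total sales contributed by those customers.
top_share <- sum(sorted[seq_len(n_top)]) / sum(sorted)

# Cumulative share of sales, useful for drawing a Pareto curve.
cum_share <- cumsum(sorted) / sum(sorted)

print(top_share)
```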

The raw RFM variables are very hard to read when plotted, because they are highly skewed!

Transformed RFM

In this project, outliers are VERY IMPORTANT! The outliers are customers with either very high or very low value, and both groups carry useful information. Therefore, I will include them in the analysis!
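
The text does not say which transformation was applied, so the following is only one common option for right-skewed RFM data, shown as an assumption: a log transform followed by standardisation. It compresses extreme values without dropping any customer. The values are made up, and `log1p` assumes non-negative inputs (returns with negative amounts would need separate handling).

```r
# Made-up, heavily skewed RFM values for five customers.
rfm <- data.frame(
  Recency   = c(1, 4, 30, 177, 365),
  Frequency = c(50, 12, 3, 1, 1),
  Monetary  = c(12000, 900, 150, 40, 0)
)

# log1p(x) = log(1 + x), so zero values are handled; outliers are kept.
rfm_log <- as.data.frame(lapply(rfm, log1p))

# Centre each column to mean 0 and scale to standard deviation 1,
# so that no single variable dominates distance calculations.
rfm_scaled <- scale(rfm_log)

print(round(rfm_scaled, 2))
```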

Modelling - Using k-means!!!

Why did I use k-means?

K-MEANS gives disjoint sets - I wanted each customer to belong to one and only one segment!
The data set has around 541,000 transaction rows, so I wanted an algorithm that scales well. K-means runs in time linear in the number of points, O(n), per iteration (for a fixed number of clusters), whereas standard hierarchical clustering is at least quadratic, O(n^2)!
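
As a small illustration of the disjoint-segments property, here is a sketch using `kmeans()` from base R's stats package. The data is synthetic (two well-separated groups standing in for scaled RFM values), and the choice of `nstart = 25` is an assumption for stability, not a setting taken from the notebook.

```r
set.seed(42)

# Synthetic stand-in for scaled RFM data: two well-separated groups of 20.
x <- rbind(
  matrix(rnorm(60, mean = -2, sd = 0.3), ncol = 3),
  matrix(rnorm(60, mean =  2, sd = 0.3), ncol = 3)
)
colnames(x) <- c("Recency", "Frequency", "Monetary")

# nstart = 25 reruns k-means from 25 random starts and keeps the best fit.
fit <- kmeans(x, centers = 2, nstart = 25)

# fit$cluster holds exactly one segment label per row (customer).
print(table(fit$cluster))
```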

Optimal number of clusters?

To choose the optimal number of clusters, we can use a number of methods:

Elbow method - suggested a 2 or 3 cluster solution
Silhouette method - suggested a 2 cluster solution
Gap statistic method - suggested a 6 cluster solution
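
The first two methods in the list above can be sketched as follows. This is a minimal example on synthetic data with three obvious groups, not the project's data, so the numbers it produces say nothing about the results quoted above. It assumes the `cluster` package (a recommended package that ships with standard R installations) for `silhouette()`.

```r
library(cluster)  # provides silhouette()

set.seed(1)

# Synthetic data with three well-separated groups of 30 points each.
x <- rbind(
  matrix(rnorm(90, mean = 0, sd = 0.4), ncol = 3),
  matrix(rnorm(90, mean = 4, sd = 0.4), ncol = 3),
  matrix(rnorm(90, mean = 8, sd = 0.4), ncol = 3)
)

# Elbow method: total within-cluster sum of squares for k = 1..6.
# The "elbow" is the k after which the curve flattens out.
ks  <- 1:6
wss <- sapply(ks, function(k) kmeans(x, centers = k, nstart = 20)$tot.withinss)

# Silhouette method: average silhouette width for k = 2..6 (higher is better).
d <- dist(x)
sil_ks  <- 2:6
avg_sil <- sapply(sil_ks, function(k) {
  cl <- kmeans(x, centers = k, nstart = 20)$cluster
  mean(silhouette(cl, d)[, "sil_width"])
})
best_k_sil <- sil_ks[which.max(avg_sil)]

print(round(wss, 1))
print(best_k_sil)
```

The gap statistic is available in the same package as `clusGap()`. As the list above shows, the methods often disagree on real data, so they are best treated as guides, not answers.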

Final Cluster Solution

My take

The decision should be based on how the business plans to use the results and the level of granularity it wants to see in the clusters. In my opinion, a 4 cluster solution is the best choice: 1 group of high value customers, 2 groups of mid value customers, and 1 group of zero value/low value customers with low frequency, low revenue and no recent purchases.
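
To judge whether a given solution is useful for the business, it helps to profile each segment on the original RFM scale. The sketch below fits a 4 cluster k-means on made-up customers and averages R, F and M per segment. The log-and-scale preprocessing and all values are assumptions for illustration, not the notebook's actual pipeline or results.

```r
set.seed(7)

# Made-up RFM values for 12 customers (illustration only).
rfm <- data.frame(
  Recency   = c(2, 5, 3, 40, 35, 50, 200, 250, 220, 300, 320, 310),
  Frequency = c(40, 35, 45, 10, 12, 9, 3, 4, 2, 1, 1, 1),
  Monetary  = c(9000, 8000, 9500, 1500, 1700, 1300, 300, 350, 250, 20, 10, 30)
)

# Assumed preprocessing: log transform, then standardise.
x <- scale(as.data.frame(lapply(rfm, log1p)))

# Fit the 4 cluster solution and attach the segment label to each customer.
fit <- kmeans(x, centers = 4, nstart = 25)
rfm$Segment <- fit$cluster

# Average R, F and M per segment, on the original (untransformed) scale.
profile <- aggregate(cbind(Recency, Frequency, Monetary) ~ Segment,
                     data = rfm, FUN = mean)

# Sort so the highest value segment comes first.
profile <- profile[order(-profile$Monetary), ]
print(profile)
```

Cluster numbers from k-means are arbitrary labels, so segments should be named from their profile (for example "high value"), not from the number.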

pareshg18/Customer-Segmentation