/online-retail

Customer segmentation using RFM

MIT License

Topic: Customer Segmentation

Method: Data Wrangling + RFM

Tools: R + Tableau

Data source: Online Retail on Kaggle

Project Objectives

Customer segmentation is an important task in marketing and sales because it helps a business market the right products to the right customer groups. Machine learning offers several ways to segment customers, such as K-Nearest Neighbors (supervised learning) and K-Means clustering (unsupervised learning). Another method, widely used in database marketing and direct marketing but not based on machine learning or artificial intelligence, is RFM, which stands for Recency, Frequency, and Monetary Value: it describes each customer by how recently they purchased, how often they purchase, and how much they spend. In this project, I show step by step how data wrangling in R can help marketers segment customers using the RFM method.
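To make the three RFM measures concrete, here is a minimal base R sketch on a made-up toy table. The column names follow the Online Retail data, but the values, the `snapshot_date` reference point, and the definitions used (days since last order, number of distinct invoices, total spend) are illustrative assumptions, not necessarily the exact choices made in this project's wrangling steps.

```r
# Toy transactions: column names mirror the Online Retail data, values are made up.
transactions <- data.frame(
  CustomerID  = c("A", "A", "B", "B", "B", "C"),
  InvoiceNo   = c("1001", "1002", "1003", "1004", "1005", "1006"),
  InvoiceDate = as.Date(c("2011-01-05", "2011-03-20", "2010-12-10",
                          "2011-02-01", "2011-04-15", "2010-12-02")),
  Quantity    = c(2, 1, 5, 3, 4, 1),
  UnitPrice   = c(10, 25, 2, 4, 5, 8),
  stringsAsFactors = FALSE
)

# Assumed reference date: the day after the last transaction in the data.
snapshot_date <- max(transactions$InvoiceDate) + 1

# One row per customer with the three RFM measures.
rfm <- do.call(rbind, lapply(
  split(transactions, transactions$CustomerID),
  function(tx) {
    data.frame(
      CustomerID = tx$CustomerID[1],
      Recency    = as.numeric(snapshot_date - max(tx$InvoiceDate)),  # days since last order
      Frequency  = length(unique(tx$InvoiceNo)),                     # number of distinct invoices
      Monetary   = sum(tx$Quantity * tx$UnitPrice),                  # total spend
      stringsAsFactors = FALSE
    )
  }
))
rownames(rfm) <- NULL

print(rfm)
```

A lower Recency and higher Frequency and Monetary values indicate a more valuable customer. Segments are then usually derived by scoring or binning each of the three columns.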

Many marketers now use Tableau to visualize data because its interface is interactive and user-friendly and it requires no coding background. Tableau visualizations can be turned into informative reports for stakeholders with different backgrounds. Therefore, after producing the customer segment data, I build Tableau dashboards to visualize the value of these segments to the business.

Data Source

In this project, I work with the Online Retail data.

There are 240,007 observations of 8 variables, as follows:

| Variable name | Description | Data type |
|---|---|---|
| InvoiceNo | Invoice number, with 12,468 unique values | character |
| StockCode | Stock code, with 3,645 unique values | character |
| Description | Product name, with 3,606 unique values | character |
| Quantity | Number of product(s) ordered | integer |
| InvoiceDate | Date and time an order was placed, from December 2010 to June 2011 | character |
| UnitPrice | Price per unit of the product ordered | double |
| CustomerID | Unique identifier for each customer, with 2,975 unique values | character |
| Country | Country where the order was placed, with 38 unique values | character |
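Because InvoiceDate is stored as character, it has to be parsed before any date arithmetic (such as recency) is possible. The sketch below shows one way to do that in base R on made-up rows. The timestamp format `"%m/%d/%Y %H:%M"` and the derived `Revenue` column are assumptions for illustration; check the format against the raw file before relying on it.

```r
# Toy rows: the values and the timestamp format are assumptions, not taken from the raw file.
raw <- data.frame(
  InvoiceNo   = c("1001", "1002"),
  InvoiceDate = c("12/1/2010 8:26", "6/30/2011 17:05"),
  Quantity    = c(6L, 2L),
  UnitPrice   = c(2.55, 1.85),
  stringsAsFactors = FALSE
)

# Parse the character timestamps into date-times (UTC chosen here for reproducibility).
raw$InvoiceDate <- as.POSIXct(raw$InvoiceDate, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Hypothetical helper column: line-item revenue, used later for the Monetary measure.
raw$Revenue <- raw$Quantity * raw$UnitPrice

str(raw)
```

If the format string does not match the data, `as.POSIXct` returns `NA` rather than raising an error, so a quick `anyNA(raw$InvoiceDate)` check after parsing is worthwhile.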

Project Deliverables

  1. Step-by-step instructions for data wrangling in R
  2. Tableau dashboard on Tableau Public. Because the full dataset is too large to upload to Tableau, I visualize only the data from December 2010 to April 2011.

Project Directory

|- data
|  -- raw                     Includes raw data
|  -- clean                   Includes clean data
|- doc                        Includes step-by-step instructions and other documentation
|- src                        Includes source code for this project
|- LICENSE                    MIT License
|- online-retail.Rproj        R project
|- README.md                  Project Overview

Limitations

  1. This customer segmentation is based solely on customer transactions. With customer demographic data (e.g., gender, age, location, yearly salary), we might discover more interesting insights.
  2. Not all customers are segmented, because some records lack a CustomerID or purchase information. Some customers have only return transactions, but I need both purchase and return data to segment a customer.
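The second limitation can be illustrated with a small base R sketch that drops records without a CustomerID and then keeps only customers who have at least one purchase. The toy values are made up, and the rule that a negative Quantity marks a return is an assumption of this sketch, not something confirmed above.

```r
# Toy transactions: values are made up for illustration.
tx <- data.frame(
  CustomerID = c("A", "A", "B", NA, "C", "C"),
  Quantity   = c(3, -1, -2, 5, 4, 1),
  UnitPrice  = c(10, 10, 5, 2, 7, 3),
  stringsAsFactors = FALSE
)

# Drop rows that cannot be attributed to a customer.
known <- tx[!is.na(tx$CustomerID), ]

# Assumption: a negative Quantity marks a return, a positive one a purchase.
has_purchase <- tapply(known$Quantity > 0, known$CustomerID, any)

# Keep only customers with at least one purchase; return-only customers are excluded.
segmentable_ids <- names(has_purchase)[has_purchase]
segmentable <- known[known$CustomerID %in% segmentable_ids, ]

print(segmentable_ids)
```

In this toy data, customer B has only a return and the fourth row has no CustomerID, so neither can be segmented.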

Future Plan

I want to try unsupervised k-means clustering on this data to see whether it segments customers similarly to or differently from RFM.
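As a starting point for that comparison, here is a minimal sketch of k-means on an RFM table using `kmeans` from base R's stats package. The RFM values, the choice of three clusters, and the decision to standardize the columns first are all assumptions for illustration; the real analysis would need to choose the number of clusters from the data.

```r
set.seed(42)  # k-means starts from random centers, so fix the seed for reproducibility

# Made-up RFM table: one row per customer.
rfm <- data.frame(
  CustomerID = paste0("C", 1:8),
  Recency    = c(5, 8, 120, 150, 10, 200, 30, 90),
  Frequency  = c(12, 10, 1, 2, 9, 1, 5, 2),
  Monetary   = c(900, 750, 40, 60, 820, 25, 300, 80),
  stringsAsFactors = FALSE
)

# Standardize so that Monetary (largest scale) does not dominate the distances.
features <- scale(rfm[, c("Recency", "Frequency", "Monetary")])

# Assumed number of clusters: 3. nstart reruns the algorithm from several random starts.
fit <- kmeans(features, centers = 3, nstart = 25)
rfm$Cluster <- fit$cluster

# Average RFM profile of each cluster, for comparison with the RFM segments.
cluster_profile <- aggregate(cbind(Recency, Frequency, Monetary) ~ Cluster,
                             data = rfm, FUN = mean)
print(cluster_profile)
```

Cross-tabulating `rfm$Cluster` against the RFM segment labels (for example with `table()`) would show how closely the two approaches agree.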