E_commerce

Data source

The dataset contains 10K order records with the following information:

  • Date of the order and the shipment
  • Shipment Mode
  • Customer ID, name, segment
  • Address of the customer (country, state, city, postal code)
  • Product ID, name, category, sub-category
  • Sales
  • Discount
  • Profit
  • Quantity

Project Introduction

The project covers exploratory data analysis, database design (SQL), the identification of data-driven business strategies, and their presentation in an interactive Tableau dashboard.

Database design

As a first step, I created a database using SQLite and Python and drew an entity-relationship diagram (ERD). I then ran queries against the database for further analysis.
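The database step could look roughly like the sketch below. The table names, column names, and sample rows are assumptions derived from the field list under "Data source"; they are not the project's actual ERD.

```python
import sqlite3

# Hypothetical normalized schema (an assumption, not the project's real ERD).
SCHEMA = """
CREATE TABLE customers (
    customer_id TEXT PRIMARY KEY,
    name        TEXT NOT NULL,
    segment     TEXT,
    country     TEXT,
    state       TEXT,
    city        TEXT,
    postal_code TEXT
);
CREATE TABLE products (
    product_id   TEXT PRIMARY KEY,
    name         TEXT NOT NULL,
    category     TEXT,
    sub_category TEXT
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT NOT NULL,  -- ISO 8601 date string
    ship_date   TEXT,
    ship_mode   TEXT,
    customer_id TEXT NOT NULL REFERENCES customers(customer_id),
    product_id  TEXT NOT NULL REFERENCES products(product_id),
    sales       REAL,
    discount    REAL,
    profit      REAL,
    quantity    INTEGER
);
"""

def build_database(path=":memory:"):
    """Create the tables and return an open connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

conn = build_database()

# Made-up sample rows, only to make the query below return something.
conn.execute(
    "INSERT INTO customers (customer_id, name, segment) VALUES (?, ?, ?)",
    ("C1", "Example Customer", "Consumer"),
)
conn.execute(
    "INSERT INTO products (product_id, name, category, sub_category) "
    "VALUES (?, ?, ?, ?)",
    ("P1", "Example Chair", "Furniture", "Chairs"),
)
conn.execute(
    "INSERT INTO orders (order_date, ship_date, ship_mode, customer_id, "
    "product_id, sales, discount, profit, quantity) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    ("2020-01-05", "2020-01-08", "Standard", "C1", "P1", 100.0, 0.1, 20.0, 2),
)
conn.commit()

# Example analysis query: total sales and profit per product category.
rows = conn.execute(
    """
    SELECT p.category, SUM(o.sales), SUM(o.profit)
    FROM orders AS o
    JOIN products AS p ON p.product_id = o.product_id
    GROUP BY p.category
    """
).fetchall()
```

Splitting customers and products into their own tables avoids repeating names and categories on every order row, which is the usual motivation for drawing an ERD before loading the data.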

Exploratory data analysis

In EDA I investigated:

Additionally, to understand customers' buying patterns, three parameters were calculated:

  • Monetary
  • Frequency
  • Recency (the number of days between a customer's last purchase and the last order date in the dataset)
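The three parameters can be computed per customer with a single pandas aggregation. The sketch below is an illustration on toy data. The column names are assumptions, and it also assumes that Monetary is the total sales per customer and Frequency is the number of distinct orders, which the text above does not spell out.

```python
import pandas as pd

# Toy order data; column names are hypothetical.
orders = pd.DataFrame({
    "customer_id": ["A", "A", "B", "C", "C", "C"],
    "order_id": [1, 2, 3, 4, 5, 6],
    "order_date": pd.to_datetime([
        "2020-01-01", "2020-03-01", "2020-02-15",
        "2020-01-10", "2020-02-20", "2020-03-10",
    ]),
    "sales": [100.0, 50.0, 200.0, 30.0, 20.0, 10.0],
})

def compute_rfm(df):
    """Return one row per customer with monetary, frequency, and recency."""
    # Recency is measured against the last order date in the dataset.
    reference_date = df["order_date"].max()
    rfm = df.groupby("customer_id").agg(
        monetary=("sales", "sum"),            # assumed: total sales
        frequency=("order_id", "nunique"),    # assumed: number of orders
        last_purchase=("order_date", "max"),
    )
    rfm["recency"] = (reference_date - rfm["last_purchase"]).dt.days
    return rfm.drop(columns="last_purchase")

rfm = compute_rfm(orders)
```

With this definition, the customer who placed the most recent order in the dataset has a recency of 0, and larger values mean the customer has been inactive for longer.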

ML Clustering with K-means

To find meaningful structure and inherent groupings in the customer data, I used an unsupervised learning method: clustering with K-means.

After removing outliers and scaling the features, I used the elbow method and silhouette analysis to determine the optimal number of clusters.
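A minimal sketch of this preprocessing and model-selection step, using scikit-learn on synthetic data. The IQR rule for outlier removal and `StandardScaler` for scaling are assumptions; the text does not say which outlier or scaling method the project used.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

def remove_outliers_iqr(X, k=1.5):
    """Keep rows where every feature lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(X, [25, 75], axis=0)
    iqr = q3 - q1
    mask = np.all((X >= q1 - k * iqr) & (X <= q3 + k * iqr), axis=1)
    return X[mask]

def evaluate_k(X_scaled, k_values, seed=0):
    """Fit K-means for each k; return inertia (elbow) and silhouette scores."""
    inertias, silhouettes = {}, {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X_scaled)
        inertias[k] = model.inertia_
        silhouettes[k] = silhouette_score(X_scaled, model.labels_)
    return inertias, silhouettes

# Synthetic stand-in for the customer features: three well-separated
# groups plus one extreme outlier.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0], [0.0, 10.0, 0.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 3)) for c in centers])
X = np.vstack([X, [[1000.0, 1000.0, 1000.0]]])

X_clean = remove_outliers_iqr(X)
X_scaled = StandardScaler().fit_transform(X_clean)
inertias, silhouettes = evaluate_k(X_scaled, range(2, 7))

# The elbow is read off a plot of inertia against k;
# the silhouette score can be maximized directly.
best_k = max(silhouettes, key=silhouettes.get)
```

Scaling matters here because K-means relies on Euclidean distance, so a feature with a large numeric range would otherwise dominate the clustering.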

With the optimal number of clusters determined to be 4, I used the K-means algorithm to divide the customers into 4 clusters based on Monetary, Frequency, and Recency.
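The final clustering step might be sketched as follows. The data is synthetic, and the column names and the use of `StandardScaler` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic table: four made-up customer profiles given as
# (monetary, frequency, recency), 25 customers each.
rng = np.random.default_rng(42)
profiles = [(200, 2, 300), (5000, 20, 10), (1500, 8, 60), (300, 3, 20)]
records = []
for monetary, frequency, recency in profiles:
    for _ in range(25):
        records.append({
            "monetary": monetary * rng.uniform(0.9, 1.1),
            "frequency": frequency,
            "recency": recency * rng.uniform(0.9, 1.1),
        })
rfm = pd.DataFrame(records)

features = ["monetary", "frequency", "recency"]
X_scaled = StandardScaler().fit_transform(rfm[features])

# Fit K-means with the chosen number of clusters and label each customer.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
rfm["cluster"] = kmeans.fit_predict(X_scaled)

# Average feature values per cluster, in the original (unscaled) units.
cluster_profile = rfm.groupby("cluster")[features].mean()
```

Inspecting `cluster_profile` in the original units, rather than the scaled ones, makes it easier to describe each cluster in business terms, for example high spend with low recency.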

To understand the customer clusters, I examined which products customers in each cluster were buying.
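One way to run this comparison is to join the cluster labels back onto the orders and pivot by product category. The sketch below uses toy data. The column names and category values are assumptions, not taken from the project.

```python
import pandas as pd

# Toy inputs; all names and values are hypothetical.
orders = pd.DataFrame({
    "customer_id": ["A", "A", "B", "C", "C"],
    "category": ["Furniture", "Technology", "Furniture",
                 "Office Supplies", "Office Supplies"],
    "sales": [100.0, 300.0, 50.0, 20.0, 30.0],
})
clusters = pd.DataFrame({
    "customer_id": ["A", "B", "C"],
    "cluster": [0, 1, 1],
})

# Attach each customer's cluster label to their orders.
merged = orders.merge(clusters, on="customer_id", how="inner")

# Total sales per cluster and category.
sales_by_cluster = merged.pivot_table(
    index="cluster", columns="category", values="sales",
    aggfunc="sum", fill_value=0,
)

# Share of each category within a cluster, so clusters of
# different sizes can be compared directly.
share = sales_by_cluster.div(sales_by_cluster.sum(axis=1), axis=0)
```

Comparing shares instead of raw totals keeps a large cluster from looking dominant in every category simply because it contains more customers.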