K-Means-Clustering-from-Scratch

Build a customer segmentation model using the K-Means clustering algorithm

Primary language: Jupyter Notebook

K-Means Clustering

K-means clustering is one of the simplest and most popular unsupervised machine learning algorithms. You define a target number k, which is the number of clusters (and therefore centroids) you want to find in the dataset. A centroid is the location, imaginary or real, that represents the center of a cluster. Every data point is allocated to exactly one cluster so as to reduce the within-cluster sum of squares, that is, the sum of squared distances between each point and the centroid of its cluster. In other words, the K-means algorithm identifies k centroids and then allocates every data point to the nearest centroid, while keeping the within-cluster sum of squares as small as possible. The ‘means’ in K-means refers to averaging the data: each centroid is the mean of the points assigned to it.
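The assign-then-average loop described above can be sketched in a few lines of NumPy. This is a minimal illustration, not necessarily the code in this repository's notebook: the function name `kmeans`, its parameters `n_iters` and `seed`, the random initialisation from data points, and the synthetic two-blob data are all assumptions made for the example.

```python
import numpy as np


def assign_labels(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    # Squared Euclidean distance from every point to every centroid.
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k, n_iters=100, seed=0):
    """Alternate assignment and mean-update steps (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Assumed initialisation: pick k distinct data points as starting centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        labels = assign_labels(points, centroids)
        # Each centroid becomes the mean of its assigned points;
        # an empty cluster keeps its previous centroid.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    # Final assignment and within-cluster sum of squares for the final centroids.
    labels = assign_labels(points, centroids)
    wcss = ((points - centroids[labels]) ** 2).sum()
    return centroids, labels, wcss


# Synthetic data for illustration only: two well-separated 2-D blobs.
data_rng = np.random.default_rng(42)
toy = np.vstack([
    data_rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2)),
    data_rng.normal(loc=(5.0, 5.0), scale=0.5, size=(50, 2)),
])
centroids, labels, wcss = kmeans(toy, k=2)
print(centroids)
print(wcss)
```

The result depends on the starting centroids, so in practice the procedure is often run from several seeds and the run with the lowest within-cluster sum of squares is kept.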

About the dataset

Assume that you are working as an analyst for “Pizzario”, a pizza delivery chain. The group has collected some interesting characteristics of customers who have previously purchased its pizzas (see the attached pizza_customers.csv file). The marketing team is planning a campaign to increase sales of a newly launched pizza. Before launching it, they want to segment the existing customers to get a clearer picture of the customer categories.