K-Means-CLustering-with-Sample-Wine-Shop-Data

Used Python to perform K-means clustering to explore a dataset of 6000 transactions from a wine shop.

A clustering analysis was performed on this sample data set in order to surface insights that would interest the owner of the business. The analysis centered on four variables: price, satisfaction, quality, and margin on the dollar.

The Analysis: K-means clustering in Python was used to explore the wine dataset, with the following results:

  clusters  size       sat    quality     price    margin
0       C1  2722  7.942946   8.991076  5.550367  0.552428
1       C2  2377  5.759276   8.319911  4.407404  0.389718
2       C3  1398  8.835837  16.097069  6.754335  0.557532

K-means clustering yields three clusters. For each one, the table gives the number of observations along with the mean satisfaction, quality, price, and margin. Cluster 1 has the most observations (2722), while Cluster 3 has the fewest (1398). For quality the picture flips: Cluster 3 has the highest mean quality (16.10). Cluster 2 has the lowest mean price (4.41) and also the lowest mean margin (0.39). The figures of the clusters and their centroids below tell the same story.
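The notebook's own code is not reproduced in this README, so the following is only a minimal, dependency-free sketch of how a table of cluster sizes and per-cluster means could be produced. The helper names (kmeans, summarize), the seed, and the toy rows are all assumptions made for illustration; the real analysis may well have used a library implementation and different preprocessing (for example, feature scaling, which is not applied here).

```python
import random
from statistics import mean

FEATURES = ["sat", "quality", "price", "margin"]


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length tuples.

    Returns (labels, centroids). Hypothetical helper, not the notebook's code.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k randomly chosen rows
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return labels, centroids


def summarize(points, labels, k):
    """Build one row per cluster: label, size, and the mean of each feature."""
    rows = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        row = {"clusters": f"C{c + 1}", "size": len(members)}
        for name, col in zip(FEATURES, zip(*members)):
            row[name] = mean(col)
        rows.append(row)
    return rows


# Synthetic toy rows (NOT the wine shop data): three blobs of 100 points each.
data_rng = random.Random(42)
toy_centers = [(2.0, 2.0, 2.0, 0.2), (5.0, 5.0, 5.0, 0.5), (8.0, 8.0, 8.0, 0.8)]
points = [
    tuple(data_rng.gauss(m, 0.1 * m) for m in center)
    for center in toy_centers
    for _ in range(100)
]

labels, centroids = kmeans(points, k=3)
rows = summarize(points, labels, 3)
for row in rows:
    print(row)
```

Cluster numbering in a sketch like this depends on the random initialisation, so C1, C2, and C3 carry no inherent order; only the sizes and means identify a segment.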

In the figures, the results are plotted along the satisfaction and quality of the wine. Cluster 3 stands out: it has the highest mean values for satisfaction, quality, price, and margin. This presents a strategic opportunity for the company, as this cluster likely represents a niche market with a preference for premium wines and a willingness to pay premium prices, potentially resulting in higher profit margins. By understanding the preferences and behaviors of this cluster, Wine Shoppe can tune its marketing, pricing, and inventory strategies to target and cater to this niche market. The descriptive analysis of the cluster is shown below.

The values 1, 2, 3 and 4 on the x-axis represent satisfaction, quality, price and margin, respectively. The box plot shows that the margin variable has low dispersion, as its IQR is small. In addition, quality and price have outliers above the bulk of the distribution, while satisfaction has outliers at the low end.

Cluster 2 customers exhibit high levels of satisfaction, with a relatively narrow interquartile range (IQR) indicating low variability. They also rate wine quality as very good, with moderate variability within the cluster and some outliers. Additionally, they tend to purchase higher-priced wines and generate a relatively high profit margin for Wine Shoppe. Overall, Cluster 2 represents a segment of customers who are satisfied with their wine purchases, perceive high quality, are willing to pay higher prices, and contribute to favorable profit margins for the business.
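The box-plot reading above (a narrow IQR for margin, low-end outliers for satisfaction) can also be checked numerically. The sketch below assumes the common 1.5 x IQR whisker rule for flagging outliers; the README does not state which rule the plots used. The function name iqr_summary and the toy values are made up for illustration and are not taken from the wine shop data.

```python
from statistics import quantiles


def iqr_summary(values):
    """Return quartiles, IQR, and points outside the 1.5 * IQR fences."""
    # "inclusive" treats the data as the full population and interpolates
    # linearly between the observed values.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return {
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "low_outliers": [v for v in values if v < low_fence],
        "high_outliers": [v for v in values if v > high_fence],
    }


# Toy values only: one unhappy customer among otherwise high satisfaction scores.
toy_satisfaction = [8, 8, 9, 8, 9, 8, 9, 9, 8, 2]
# Toy margins packed into a narrow band, so the IQR is small.
toy_margin = [0.55, 0.56, 0.54, 0.55, 0.56, 0.55, 0.54, 0.56]

sat = iqr_summary(toy_satisfaction)
margin = iqr_summary(toy_margin)
print("satisfaction:", sat)
print("margin:", margin)
```

Running the same function on each variable within a cluster gives the numbers behind each box: the IQR is the height of the box, and the two outlier lists are the points drawn beyond the whiskers.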