In this notebook we explore several approaches to clustering using the credit card dataset available on Kaggle. After exploring the dataset to get familiar with the data, we clean it to prepare it for the clustering task: handling skewness, removing outliers, and handling null and duplicate values. We then turn to the clustering problem itself and apply K Means, Agglomerative Hierarchical, DBSCAN, and Mean Shift clustering to the data. Finally, we draw some conclusions and summarize the results.
Author: Arash Sadeghzadeh
Data for this notebook can be retrieved from the following URL: https://www.kaggle.com/datasets/arjunbhasin2013/ccdata
Table of Contents
Data Exploration
Data Cleaning
    Handling Skewness
    Deleting Outliers
    Handling Null Values
    Handling Duplicate Values
Applying PCA
Clustering
    K Means Clustering
        K Means Method
        Tuning Hyperparameters
        Representing the Results
    Agglomerative Hierarchical Clustering
        AHC Method
        Tuning Hyperparameters
        Representing the Results
    DBSCAN Clustering
        DBSCAN Method
        Tuning Hyperparameters
        Representing the Results
    Mean Shift Clustering
        Mean Shift Method
        Tuning Hyperparameters
        Representing the Results
Conclusion and Final Results
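As a rough orientation before the detailed sections, the workflow outlined above can be sketched in a few lines. This is a minimal sketch under several assumptions, not the notebook's actual code: it runs on a small synthetic stand-in rather than the Kaggle file, the column names `BALANCE`, `PURCHASES`, and `CREDIT_LIMIT` are used for illustration only, the log transform, 3-sigma outlier rule, and median imputation are just one plausible choice for each cleaning step, and every hyperparameter (`n_clusters=3`, `eps=0.5`, `min_samples=5`) is a placeholder rather than a tuned value.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans, MeanShift
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the credit card data (assumption: the real data
# would be loaded from the downloaded Kaggle CSV instead).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "BALANCE": rng.lognormal(mean=7.0, sigma=1.0, size=300),
    "PURCHASES": rng.lognormal(mean=6.0, sigma=1.2, size=300),
    "CREDIT_LIMIT": rng.lognormal(mean=8.0, sigma=0.5, size=300),
})
df.loc[10::50, "CREDIT_LIMIT"] = np.nan               # inject missing values
df = pd.concat([df, df.iloc[:5]], ignore_index=True)  # inject duplicate rows

# Handling skewness: log1p compresses the long right tails.
clean = np.log1p(df)

# Deleting outliers: drop rows more than 3 standard deviations from the mean
# in any column (rows with NaN are kept here and imputed in the next step).
z = (clean - clean.mean()) / clean.std()
clean = clean[~(z.abs() > 3).any(axis=1)]

# Handling null values: impute with the column median.
clean = clean.fillna(clean.median())

# Handling duplicate values: keep the first occurrence of each row.
clean = clean.drop_duplicates()

# Scale the features, then apply PCA to project onto two components.
X = StandardScaler().fit_transform(clean)
X_pca = PCA(n_components=2).fit_transform(X)

# Fit the four clustering methods with placeholder hyperparameters.
models = {
    "KMeans": KMeans(n_clusters=3, n_init=10, random_state=0),
    "Agglomerative": AgglomerativeClustering(n_clusters=3),
    "DBSCAN": DBSCAN(eps=0.5, min_samples=5),
    "MeanShift": MeanShift(),
}
labels = {name: model.fit_predict(X_pca) for name, model in models.items()}

for name, lab in labels.items():
    n_clusters = len(set(lab) - {-1})  # DBSCAN marks noise points as -1
    print(f"{name}: {n_clusters} cluster(s)")
```

The cleaning steps follow the order in which the notebook lists them. The cluster counts printed here say nothing about the real dataset; they only confirm that the pipeline runs end to end.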