
Data visualization and implementation of clustering algorithms on a dataset of football players


Clustering

Implementations of some well-known clustering algorithms, together with an analysis of each.

  • football.csv contains data on about 18K football players: their in-game features, abilities and skills, plus other attributes such as club, nationality and height.

  • 1_data_visualization.ipynb contains visualizations of the CSV data.

  • 2_KMeans.ipynb contains an implementation of the K-means clustering algorithm written from scratch, without using any built-in libraries.

  • 3_Agglomerative.ipynb contains an implementation of agglomerative hierarchical clustering.

  • 3_Divisive.ipynb contains an implementation of divisive hierarchical clustering.

  • 4_DBSCAN.ipynb contains an implementation of the DBSCAN clustering algorithm.

  • Report.pdf contains our detailed analysis of all the tasks and a comparison between them.
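For readers who want a feel for what a from-scratch K-means looks like before opening the notebook, here is a minimal sketch. It is not the code from 2_KMeans.ipynb: the function name `kmeans`, its parameters, the random initialization and the use of NumPy for array arithmetic are all assumptions made for illustration.

```python
import numpy as np


def kmeans(X, k, n_iters=100, seed=0):
    """Illustrative Lloyd-style K-means (hypothetical helper, not the notebook's code)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)

    # Assumption: initialize centroids as k distinct points drawn from the data.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()

    for _ in range(n_iters):
        # Euclidean distance from every point to every centroid, shape (n, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)

        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    # Recompute labels so they match the returned centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1), centroids
```

Calling `labels, centroids = kmeans(X, k=3)` on a cleaned numeric feature matrix returns one cluster index per row and the final centroid positions. The result depends on the random initialization, so different seeds may give different clusterings.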

All the implementations share the same initial data-cleaning code. After each cell, print statements show the progress of the code.

To visualize the clusters in 2D, PCA was used for dimensionality reduction.
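The projection step can be sketched as follows. This is an illustration only: the notebooks may well use a library routine instead, and the helper name `pca_2d` and the SVD-based approach are assumptions rather than the repository's code.

```python
import numpy as np


def pca_2d(X):
    """Project rows of X onto their first two principal components (hypothetical helper)."""
    X = np.asarray(X, dtype=float)

    # PCA works on mean-centered data.
    X_centered = X - X.mean(axis=0)

    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)

    # Coordinates of each sample along the top two directions, shape (n_samples, 2).
    return X_centered @ Vt[:2].T
```

The two output columns can then be used as x and y coordinates in a scatter plot, with points colored by cluster label. Features on very different scales are usually standardized first, since PCA is sensitive to feature variance.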