Clustering

1. Exploration

$ python exploration.py

2. K-means Clustering

2.1 Code

$ python kmeans.py digits-embedding.csv 10

2.2 Analysis

For 2.1 and 2.2, run the command below to get the plot showing the within-cluster sum of squared distances (WC SSD) and the silhouette coefficient (SC) as a function of K.

$ python kmeans-analysis_2_12.py
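
For reference, the two quantities plotted by this script can be computed as in the sketch below. It is a minimal pure-Python illustration, not the code in `kmeans-analysis_2_12.py`. The function names `wc_ssd` and `silhouette` are hypothetical, and the silhouette follows the common textbook definition (mean intra-cluster distance versus mean distance to the nearest other cluster), which may differ from the variant used in this repository.

```python
import math


def wc_ssd(points, labels):
    """Sum of squared distances from each point to its own cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        for m in members:
            total += sum((x - c) ** 2 for x, c in zip(m, centroid))
    return total


def silhouette(points, labels):
    """Mean silhouette over all points (textbook definition, an assumption)."""
    n = len(points)
    scores = []
    for i in range(n):
        # Accumulate distances from point i to every other point, per cluster.
        sums, counts = {}, {}
        for j in range(n):
            if i == j:
                continue
            d = math.dist(points[i], points[j])
            sums[labels[j]] = sums.get(labels[j], 0.0) + d
            counts[labels[j]] = counts.get(labels[j], 0) + 1
        own = labels[i]
        others = [sums[c] / counts[c] for c in sums if c != own]
        if own not in counts or not others:
            # Convention: score 0 for a singleton cluster or a single cluster.
            scores.append(0.0)
            continue
        a = sums[own] / counts[own]  # mean distance within own cluster
        b = min(others)              # mean distance to nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / n
```

With tight, well-separated clusters, WC SSD is small and SC approaches 1; as K grows, WC SSD keeps shrinking, which is why SC is usually inspected alongside it when choosing K.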

For 2.3, run the following command to get the average and standard deviation of WC SSD and SC for the different values of K.

$ python kmeans-analysis_2_3.py
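
The aggregation step can be sketched as follows, assuming the per-run scores have already been collected into a dictionary keyed by K. The helper name `summarize` and the input layout are hypothetical, and the use of the population standard deviation (rather than the sample one) is an assumption; the script may do this differently.

```python
import statistics


def summarize(scores_by_k):
    """Map each K to (mean, population standard deviation) of its per-run scores."""
    return {
        k: (statistics.mean(values), statistics.pstdev(values))
        for k, values in sorted(scores_by_k.items())
    }
```

The same helper would be called once for the WC SSD values and once for the SC values, since both are just lists of per-run numbers for each K.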

For 2.4, run the following command to get the NMI for the K chosen in Step 2, along with the visualization.

$ python kmeans-analysis_2_4.py
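
As a reminder of what NMI (normalized mutual information) measures, here is a minimal pure-Python sketch comparing ground-truth labels with cluster assignments. The function names are hypothetical, and normalizing by the arithmetic mean of the two entropies is an assumption; other normalizations exist (for example, dividing by the sum of the entropies), and `kmeans-analysis_2_4.py` may use a different one.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(labels_true, labels_pred):
    """Mutual information divided by the mean of the two entropies (assumed)."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    mi = 0.0
    for (t, p), c in joint.items():
        mi += (c / n) * math.log(c * n / (count_true[t] * count_pred[p]))
    denom = (entropy(labels_true) + entropy(labels_pred)) / 2
    # Convention: two trivial (single-group) labelings count as a perfect match.
    return mi / denom if denom > 0 else 1.0
```

NMI does not depend on how the clusters are numbered, so a clustering that recovers the true classes under different label names still scores 1, while a clustering that is independent of the classes scores 0.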

3. Hierarchical Clustering

For all five parts of Question 3, run the following command to get the results.

$ python hierarchical.py