Clustering algorithm implementations from scratch with Python (k-means, EM-GMM, mean-shift, agglomerative)
Clustering Algorithm Implementation and Visualization from Scratch with Python
Overview
This project implements four popular clustering algorithms from scratch in Python, designed to work for datasets with d >= 2 dimensions and k >= 2 clusters. The implementations are tested on 2D datasets and compared visually with scikit-learn's implementations to evaluate correctness and performance.
Implemented Clustering Algorithms
K-Means Clustering
Gaussian Mixture Model (GMM) using Expectation-Maximization (EM)
Mean-Shift Clustering
Agglomerative Clustering
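As an illustration of what a from-scratch implementation of the first algorithm can look like, here is a minimal NumPy sketch of Lloyd's k-means iteration. It is not the code in KMeans.py; the function name `kmeans_sketch`, its parameters, and the random-point initialization are assumptions made for this example.

```python
import numpy as np


def kmeans_sketch(X, k, n_iter=100, seed=0):
    """Hypothetical helper: Lloyd's algorithm on an (n, d) array X."""
    rng = np.random.default_rng(seed)
    # Assumed initialization: pick k distinct data points as starting centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]

    def assign(centers):
        # Squared Euclidean distance from every point to every centroid: shape (n, k).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        # Move each centroid to the mean of its assigned points;
        # keep the old centroid if a cluster ends up empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break

    # Final assignment against the final centroids.
    return assign(centroids), centroids


# Usage on two well-separated synthetic 2D blobs.
rng = np.random.default_rng(1)
X_demo = np.vstack([
    rng.normal(0.0, 0.3, size=(50, 2)),
    rng.normal(5.0, 0.3, size=(50, 2)),
])
demo_labels, demo_centroids = kmeans_sketch(X_demo, k=2)
```

The distance computation works for any number of dimensions d, which matches the project's stated goal of supporting d >= 2.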
Python Implementations
KMeans.py: K-Means clustering.
KMeans_Ver0.py: K-Means clustering (2nd version).
GaussianMM.py: EM-GMM.
GaussianMM_Ver0.py: EM-GMM with AIC, BIC, and predict functions (2nd version).
MeanShift.py: Mean-Shift clustering.
Agglomerative.py: Agglomerative clustering.
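For readers unfamiliar with the AIC and BIC functions mentioned for GaussianMM_Ver0.py, the sketch below shows the standard definitions for a Gaussian mixture. It assumes full covariance matrices and a total (not per-sample) log-likelihood; the function names are hypothetical and the repository's implementation may use different conventions.

```python
import math


def gmm_n_parameters(k, d):
    """Free parameters of a k-component, d-dimensional GMM (full covariance assumed)."""
    means = k * d
    covariances = k * d * (d + 1) // 2  # one symmetric d x d matrix per component
    weights = k - 1                     # mixing weights sum to 1
    return means + covariances + weights


def aic(log_likelihood, k, d):
    """Akaike information criterion: 2p - 2 ln L (lower is better)."""
    return 2 * gmm_n_parameters(k, d) - 2 * log_likelihood


def bic(log_likelihood, k, d, n_samples):
    """Bayesian information criterion: p ln n - 2 ln L (lower is better)."""
    return gmm_n_parameters(k, d) * math.log(n_samples) - 2 * log_likelihood
```

Both criteria penalize model size, so they can be used to choose the number of components k by fitting several values and keeping the one with the lowest score.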
Evaluations and Tests
test_2d_visualization.py: Tests each implementation on 2D datasets with visualization, comparing the results to scikit-learn's equivalent algorithms.
data_2d_test/: Contains the datasets used for testing.
test_2d_visualization_results/: Stores the output images of the clustering results.
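The comparison described above is visual. If a numeric check is also wanted, note that cluster label numbers are arbitrary, so two labelings should be compared with a permutation-invariant score rather than by direct equality. The sketch below computes the plain Rand index with NumPy; the function name is hypothetical and this check is not part of the test script as described here.

```python
import numpy as np


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree.

    A pair counts as an agreement when both labelings put the two points
    in the same cluster, or both put them in different clusters.
    """
    a = np.asarray(labels_a)
    b = np.asarray(labels_b)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    # Count each unordered pair once (strict upper triangle).
    iu = np.triu_indices(len(a), k=1)
    return float(np.mean(same_a[iu] == same_b[iu]))


# Identical partitions with swapped label names score 1.0.
score = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
```

This version builds n x n comparison matrices, so it is only suitable for small datasets. scikit-learn's `sklearn.metrics.adjusted_rand_score` provides a chance-corrected variant of the same idea.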