Benchmark-Mixed-Clustering

Benchmarking clustering algorithms suited for mixed data.

Algorithms implemented:
- K-Prototypes
- KAMILA
- Modha-Spangler
- FAMD-KMeans
- DenseClus
- ClustMD (to do)
- Hierarchical clustering with Gower's Distance
- MixtComp
- KCMM (to add)
- Pretopological Clustering (with FAMD, Laplacian Eigenmaps, UMAP and PaCMAP)
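For readers unfamiliar with Gower's distance from the list above, here is a minimal pure-Python sketch of its standard definition: range-normalized absolute difference for numeric features, simple mismatch for categorical ones, averaged over all features. The helper names (`gower_distance`, `gower_matrix`) and the toy rows are illustrative assumptions, not code from this repository.

```python
def gower_distance(a, b, numeric_ranges):
    """Gower dissimilarity between two records of mixed type.

    a, b: equal-length sequences mixing numbers and category labels.
    numeric_ranges: {feature_index: max - min of that feature over the dataset}.
    Features whose index is not in numeric_ranges are treated as categorical.
    """
    total = 0.0
    for k, (x, y) in enumerate(zip(a, b)):
        if k in numeric_ranges:
            r = numeric_ranges[k]
            # Range-normalized absolute difference; a constant feature contributes 0.
            total += abs(x - y) / r if r > 0 else 0.0
        else:
            # Categorical feature: 0 when the labels match, 1 otherwise.
            total += 0.0 if x == y else 1.0
    return total / len(a)


def gower_matrix(rows, numeric_idx):
    """Full pairwise Gower distance matrix as a list of lists."""
    ranges = {}
    for k in numeric_idx:
        column = [row[k] for row in rows]
        ranges[k] = max(column) - min(column)
    n = len(rows)
    return [[gower_distance(rows[i], rows[j], ranges) for j in range(n)]
            for i in range(n)]


# Toy data (hypothetical): two numeric features and one categorical feature.
rows = [
    (1.0, 10.0, "red"),
    (2.0, 10.0, "red"),
    (5.0, 30.0, "blue"),
]
D = gower_matrix(rows, numeric_idx=[0, 1])
```

A matrix like `D` could then be fed to a hierarchical clustering routine that accepts precomputed distances, for example SciPy's `scipy.cluster.hierarchy.linkage` after conversion with `scipy.spatial.distance.squareform`. Whether the notebooks here do it that way is not stated in this README.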

The benchmark compares computational cost (memory usage, execution time) and internal validity indices (Calinski-Harabasz, Davies-Bouldin, Silhouette).
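As a rough illustration of how execution time and memory usage might be measured for one run, the sketch below uses only the Python standard library (`time.perf_counter` and `tracemalloc`). The `measure_cost` helper and the `toy_clustering` stand-in are hypothetical names introduced for this example; the notebooks may measure cost differently. Note that `tracemalloc` only sees allocations made through Python's memory allocator, so memory used by native extensions may be undercounted.

```python
import time
import tracemalloc


def measure_cost(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes since start().
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def toy_clustering(points, threshold):
    # Hypothetical stand-in for a real algorithm: split 1-D points at a threshold.
    return [0 if p < threshold else 1 for p in points]


labels, seconds, peak_bytes = measure_cost(
    toy_clustering, list(range(1000)), threshold=500
)
print(f"{seconds:.6f} s, peak {peak_bytes} bytes")
```

For the validity indices themselves, scikit-learn exposes `silhouette_score`, `calinski_harabasz_score`, and `davies_bouldin_score` in `sklearn.metrics`; this README does not say which implementation the benchmark relies on.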

The benchmark uses real-world data (see the /data/ folder) and generated data.
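One simple way to generate mixed data with known clusters is sketched below: each cluster draws its numeric features from unit-variance Gaussians around a cluster-specific center and its categorical feature from a cluster-specific distribution. This is an assumption for illustration only; the generator actually used in this repository is not described here, and `generate_mixed_data` and its parameters are hypothetical.

```python
import random


def generate_mixed_data(n_per_cluster, centers, category_weights, seed=0):
    """Return (rows, labels) for a synthetic mixed-type dataset.

    centers: one tuple of numeric means per cluster.
    category_weights: one {category: probability} dict per cluster.
    """
    rng = random.Random(seed)  # Seeded for reproducible benchmarks.
    rows, labels = [], []
    for cluster, (center, weights) in enumerate(zip(centers, category_weights)):
        categories = list(weights)
        probs = [weights[c] for c in categories]
        for _ in range(n_per_cluster):
            # Numeric part: independent Gaussians with standard deviation 1.
            numeric = [rng.gauss(mu, 1.0) for mu in center]
            # Categorical part: one draw from the cluster's distribution.
            category = rng.choices(categories, weights=probs, k=1)[0]
            rows.append((*numeric, category))
            labels.append(cluster)
    return rows, labels


rows, labels = generate_mixed_data(
    n_per_cluster=50,
    centers=[(0.0, 0.0), (5.0, 5.0)],
    category_weights=[{"a": 0.8, "b": 0.2}, {"a": 0.2, "b": 0.8}],
)
```

Because the true labels are returned alongside the rows, data generated this way can also support external validation, although the benchmark described above reports internal indices only.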