/mixed-data-clustering

Application of information entropy for mixed data clustering



The main theme of this paper is the analysis and implementation of mixed data clustering. The clustering algorithm was implemented following J. Liang, et al., Determining the number of clusters using information entropy for mixed data, Pattern Recognition (2012), doi:10.1016/j.patcog.2011.12.017. The main contributions are visualization of mixed-type data using multidimensional scaling, non-random selection of the initial cluster centers, and validation of the clustering results using the Dunn and Davies-Bouldin indexes. The modified algorithm was tested on synthetic and real data sets. The real data sets originally came with known group labels, which were removed before clustering. In all cases the true number of clusters was detected correctly.
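To illustrate the validation step, here is a minimal sketch of a Dunn index computed over mixed records. It is not the repository's code. The names `mixed_distance`, `dunn_index`, and `toy_distance` are hypothetical, and the dissimilarity used (Euclidean distance on the numeric fields plus a count of mismatching categorical fields) is an assumption chosen for simplicity, not necessarily the measure used in Liang et al.

```python
from itertools import combinations


def mixed_distance(a, b, num_idx, cat_idx):
    """Dissimilarity between two records with numeric and categorical fields.

    Assumption: Euclidean distance on the numeric fields plus the number of
    mismatching categorical fields (simple matching), equally weighted.
    """
    numeric = sum((a[i] - b[i]) ** 2 for i in num_idx) ** 0.5
    categorical = sum(a[i] != b[i] for i in cat_idx)
    return numeric + categorical


def dunn_index(points, labels, dist):
    """Smallest between-cluster distance divided by largest cluster diameter.

    Requires at least two clusters. Higher values mean compact,
    well-separated clusters.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    groups = list(clusters.values())
    if len(groups) < 2:
        raise ValueError("Dunn index needs at least two clusters")
    # Largest distance between two members of the same cluster.
    max_diameter = max(
        (dist(p, q) for g in groups for p, q in combinations(g, 2)),
        default=0.0,
    )
    # Smallest distance between members of two different clusters.
    min_separation = min(
        dist(p, q)
        for g, h in combinations(groups, 2)
        for p in g
        for q in h
    )
    return min_separation / max_diameter if max_diameter > 0 else float("inf")


def toy_distance(p, q):
    # Field 0 is numeric, field 1 is categorical.
    return mixed_distance(p, q, num_idx=[0], cat_idx=[1])


# Toy records: (numeric feature scaled to [0, 1], categorical feature).
records = [(0.0, "a"), (0.1, "a"), (0.9, "b"), (1.0, "b")]

good = dunn_index(records, [0, 0, 1, 1], toy_distance)  # natural grouping
bad = dunn_index(records, [0, 1, 0, 1], toy_distance)   # scrambled grouping
print(good > bad)
```

The natural grouping scores higher than the scrambled one, which is the property that makes the index usable for comparing candidate cluster counts. The Davies-Bouldin index works in the opposite direction: lower values indicate a better partition.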