kironjayesh/Enhanced-k-mean-with-automated-k-estimation-and-outlier-detection-using-auto-encoder-neural-network
Large, unstructured data sets are difficult to study directly. Clustering helps reveal structure by grouping items so that the members of a group are more similar to each other than to items in other groups. K-means is one such clustering technique; it works on the principle of minimizing intra-cluster distance and maximizing inter-cluster distance. Its major drawbacks are the difficulty of choosing the right value of K and its inability to detect noise. To overcome these drawbacks, this paper discusses three techniques. First, naïve-sharding centroid initialization is used to pick the initial centroids, which substantially improves the efficiency of the method. Second, a more effective elbow method is proposed to identify the optimal number of clusters. Finally, anomaly detection with an auto-encoder neural network is used to filter out outliers and noise.
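The first two steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the notebook's actual code: it assumes the commonly described form of naïve sharding (sort rows by the sum of their attributes, split them into K equal shards, and use each shard's mean as an initial centroid) and a plain within-cluster sum of squares (WCSS) curve as the input to an elbow analysis. The function names `naive_sharding_init`, `kmeans`, and `wcss_curve` are hypothetical, and the paper's enhanced elbow criterion and auto-encoder outlier filter are not reproduced here.

```python
import numpy as np


def naive_sharding_init(X, k):
    """Initial centroids via naive sharding (assumed variant: row-sum composite)."""
    X = np.asarray(X, dtype=float)
    composite = X.sum(axis=1)                 # one composite value per row
    order = np.argsort(composite, kind="stable")
    shards = np.array_split(X[order], k)      # k nearly equal, sorted shards
    return np.vstack([shard.mean(axis=0) for shard in shards])


def assign(X, centroids):
    """Index of the nearest centroid for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=100):
    """Lloyd's k-means seeded with naive sharding; returns centroids, labels, WCSS."""
    X = np.asarray(X, dtype=float)
    centroids = naive_sharding_init(X, k)
    for _ in range(n_iter):
        labels = assign(X, centroids)
        updated = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:              # keep the old centroid if a cluster empties
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    labels = assign(X, centroids)
    wcss = float(((X - centroids[labels]) ** 2).sum())
    return centroids, labels, wcss


def wcss_curve(X, ks):
    """WCSS for each candidate k; the 'elbow' is where the curve flattens."""
    return [kmeans(X, k)[2] for k in ks]


if __name__ == "__main__":
    data = np.array([[0, 0], [1, 1], [10, 10], [11, 11]])
    print(naive_sharding_init(data, 2))
    print(wcss_curve(data, [1, 2]))
```

Because the initialization is deterministic, repeated runs on the same data give the same clustering, which makes the WCSS values across different K directly comparable.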
Jupyter Notebook