Management and analysis of physical dataset project
Implement and benchmark alternative versions of common clustering algorithms in a Spark environment, without using the clustering functions that Spark already provides.
The project therefore focuses on implementing these algorithms efficiently in a distributed system.
Algorithms covered: mini-batch k-means, k-means++, k-means||
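
As a rough illustration of two of the listed algorithms, the sketch below shows k-means++ seeding followed by mini-batch k-means updates in plain, single-machine Python. It is not the project's implementation: the function names (`sq_dist`, `kmeans_pp_init`, `mini_batch_kmeans`) and their parameters are hypothetical, and a Spark version would presumably distribute the per-point distance computations across partitions instead of looping over a local list.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: each new center is drawn with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [rng.choice(points)]
    # d2[i] = squared distance from points[i] to its nearest center so far
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a center; fall back to uniform choice.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
        # Only the newest center can shrink a point's nearest-center distance.
        d2 = [min(old, sq_dist(p, centers[-1])) for old, p in zip(d2, points)]
    return centers


def mini_batch_kmeans(points, centers, batch_size, n_iter, rng):
    """Mini-batch k-means: refine centers using small random batches,
    with a per-center learning rate of 1 / (points assigned so far)."""
    centers = [list(c) for c in centers]
    counts = [0] * len(centers)
    for _ in range(n_iter):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch first, with the centers held fixed.
        assigned = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in batch
        ]
        # Then move each assigned center a step towards its point.
        for p, j in zip(batch, assigned):
            counts[j] += 1
            eta = 1.0 / counts[j]
            centers[j] = [(1 - eta) * c + eta * x for c, x in zip(centers[j], p)]
    return [tuple(c) for c in centers]


if __name__ == "__main__":
    rng = random.Random(42)
    # Toy data: two loose groups of 2-D points (made up for illustration).
    data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
    data += [(rng.gauss(8, 1), rng.gauss(8, 1)) for _ in range(50)]
    seeds = kmeans_pp_init(data, k=2, rng=rng)
    print(mini_batch_kmeans(data, seeds, batch_size=10, n_iter=30, rng=rng))
```

In a distributed setting, the step that recomputes `d2` is the natural candidate for a map over the dataset, and the weighted draw needs only the sum of the weights. k-means|| changes the seeding so that many candidate centers are sampled per pass, which reduces the number of passes over the data.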