BDP 05: CLUSTERING OF LARGE UNLABELED DATASETS
- K-Means Clustering on the Ages of the Users dataset.
- K-Means Clustering on two features of the Posts dataset (Score and ViewCount).
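
For orientation, one K-Means round can be read as a map step (assign each point to its nearest centroid) followed by a reduce step (average the points in each group). The sketch below is a minimal single-machine Python illustration of that idea, not the code in this repository. The function names (`assign`, `kmeans_iteration`, `kmeans`), the Euclidean distance, and the stopping tolerance are all assumptions made for the example. Points are tuples, so the same code covers the 1D case (age) and the 2D case (Score, ViewCount).

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def assign(point, centroids):
    """Map step: return the index of the centroid nearest to the point."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans_iteration(points, centroids):
    """One map/reduce round: group points by nearest centroid, then average."""
    groups = defaultdict(list)
    for p in points:
        groups[assign(p, centroids)].append(p)

    new_centroids = []
    for i, old in enumerate(centroids):
        members = groups.get(i)
        if not members:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
            continue
        # Reduce step: component-wise mean of the cluster members.
        new_centroids.append(
            tuple(sum(col) / len(members) for col in zip(*members))
        )
    return new_centroids


def kmeans(points, centroids, max_iter=20, tol=1e-6):
    """Repeat rounds until no centroid moves more than tol, or max_iter is hit."""
    for _ in range(max_iter):
        updated = kmeans_iteration(points, centroids)
        if all(dist(a, b) <= tol for a, b in zip(updated, centroids)):
            return updated
        centroids = updated
    return centroids
```

For example, `kmeans([(1.0,), (2.0,), (10.0,), (11.0,)], [(0.0,), (5.0,)])` clusters four hypothetical ages around two centroids. In a MapReduce setting each round would typically be one job, with the current centroids shared with every mapper.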
2D Clustering with Join of Datasets
- Includes the helper MapReduce used to preprocess the data and join the datasets.
- K-Means Clustering of user ages and badge counts from the Users and Badges datasets.
- Helper MapReduce used to normalise the data points.
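
The two helper steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the repository's MapReduce code: the join is written as a reduce-side join keyed on user id, the record shapes (`(user_id, age)` and `(user_id, badge_name)`) are hypothetical, and min-max scaling is assumed as the normalisation method because the listing does not say which one is used.

```python
from collections import defaultdict


def join_age_with_badge_count(users, badges):
    """Reduce-side join sketch: key both inputs by user id, combine per key.

    users  : iterable of (user_id, age) pairs; age may be None (hypothetical shape)
    badges : iterable of (user_id, badge_name) pairs (hypothetical shape)
    """
    grouped = defaultdict(lambda: {"age": None, "badges": 0})
    for user_id, age in users:            # "map" over the Users records
        grouped[user_id]["age"] = age
    for user_id, _badge_name in badges:   # "map" over the Badges records
        grouped[user_id]["badges"] += 1
    # "reduce": emit (age, badge_count) only for users with a known age.
    return {
        uid: (rec["age"], rec["badges"])
        for uid, rec in grouped.items()
        if rec["age"] is not None
    }


def min_max_normalise(points):
    """Scale every feature to [0, 1] so no feature dominates the distance."""
    if not points:
        return []
    columns = list(zip(*points))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]

    def scale(value, low, high):
        # A constant feature carries no information; map it to 0.0.
        return 0.0 if high == low else (value - low) / (high - low)

    return [
        tuple(scale(v, lo, hi) for v, lo, hi in zip(p, lows, highs))
        for p in points
    ]
```

Normalising before clustering matters here because ages and badge counts can live on very different scales, and an unscaled Euclidean distance would be driven mostly by the feature with the larger range.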
Extract Sample From Posts
- Helper MapReduce used to extract sample data from the Posts dataset for plot purposes.
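
One common way to pull a plottable sample out of a large dataset is a map-only filter that keeps each record independently with a fixed probability. The sketch below shows that approach in Python. Whether the helper in this repository samples this way is an assumption, and the names `sample_records`, `rate`, and `seed` are hypothetical.

```python
import random


def sample_records(records, rate, seed=None):
    """Keep each record independently with probability `rate` (0.0 to 1.0)."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    # rng.random() is in [0.0, 1.0), so rate=1.0 keeps all and rate=0.0 keeps none.
    return [record for record in records if rng.random() < rate]
```

For example, `sample_records(posts, rate=0.01, seed=42)` would keep roughly one percent of a hypothetical list of (Score, ViewCount) pairs. The sample size varies from run to run unless the seed is fixed.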