- use data from mini project #1 (or other); begin with |N| ≥ 500, |D| ≥ 10
- client-server system: python for processing (server), D3 for VIS (client)
- implement random sampling and stratified sampling
- the latter includes the need for k-means clustering (optimize k using elbow)
- find the intrinsic dimensionality of the data using PCA
- produce scree plot visualization and mark the intrinsic dimensionality
- obtain the three attributes with highest PCA loadings
- visualize data projected onto the top two PCA vectors via 2D scatterplot
- visualize data via MDS (Euclidean & correlation distance) in 2D scatterplots
- visualize scatterplot matrix of the three attributes with the highest PCA loadings
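
The sampling tasks above can be sketched in Python roughly as follows. This is a minimal illustration, not the project's actual server code: the data is synthetic, the function names (random_sample, kmeans, elbow_k, stratified_sample) are hypothetical, the k-means is a plain NumPy Lloyd's loop with farthest-point initialisation (a library version such as scikit-learn's KMeans could be swapped in), and the elbow is picked with a "largest gap below the chord" heuristic, which is only one of several ways to read an elbow plot.

```python
import numpy as np

def sq_dists(X, C):
    """Squared Euclidean distances between rows of X (n, d) and rows of C (k, d)."""
    return ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)

def random_sample(X, fraction, rng):
    """Uniform random sample of rows, without replacement."""
    n = max(1, int(round(fraction * len(X))))
    return X[rng.choice(len(X), size=n, replace=False)]

def kmeans(X, k, rng, n_iter=100):
    """Lloyd's k-means. Returns (labels, sum of squared errors)."""
    # Farthest-point initialisation (assumption: simple, but sensitive to outliers).
    centers = [X[rng.integers(len(X))]]
    while len(centers) < k:
        nearest = sq_dists(X, np.array(centers)).min(axis=1)
        centers.append(X[nearest.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = sq_dists(X, centers).argmin(axis=1)
        # Recompute each center; keep the old one if a cluster ends up empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d2 = sq_dists(X, centers)
    return d2.argmin(axis=1), float(d2.min(axis=1).sum())

def elbow_k(X, k_values, rng):
    """Pick k where the normalised SSE curve lies farthest below its end-to-end chord."""
    ks = np.array(list(k_values), dtype=float)
    sse = np.array([kmeans(X, int(k), rng)[1] for k in ks])
    x = (ks - ks[0]) / (ks[-1] - ks[0])
    y = (sse - sse.min()) / (sse.max() - sse.min())
    chord = y[0] + (y[-1] - y[0]) * x
    return int(ks[np.argmax(chord - y)]), sse

def stratified_sample(X, labels, fraction, rng):
    """Sample the same fraction from every cluster (stratum)."""
    picks = []
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        n = max(1, int(round(fraction * len(members))))
        picks.append(rng.choice(members, size=n, replace=False))
    idx = np.concatenate(picks)
    return X[idx], labels[idx]

# Synthetic stand-in for the real data: 600 rows, 10 attributes, 3 clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=m, scale=1.0, size=(200, 10)) for m in (0.0, 10.0, 20.0)])

best_k, sse_curve = elbow_k(X, range(1, 9), rng)
labels, _ = kmeans(X, best_k, rng)
X_random = random_sample(X, 0.25, rng)
X_strat, strat_labels = stratified_sample(X, labels, 0.25, rng)
```

Stratified sampling keeps the cluster proportions of the full data set, whereas a random sample of the same size may over- or under-represent small clusters. The sse_curve array is what an elbow plot on the D3 side would display.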
YouTube link to view results: https://www.youtube.com/watch?v=EV9T5XxKSWc
- Renderings_with_imagevis_task_3 folder
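
For the PCA and MDS tasks in the list above, the following is a rough NumPy-only sketch and not the project's actual code. Several choices here are assumptions: the data is synthetic, the intrinsic dimensionality is taken as the number of components needed to reach 75% cumulative explained variance, attribute importance is the sum of squared loadings over those components, and MDS is the classical (Torgerson) variant, which may differ from the MDS implementation used in the project.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data: 500 rows, 10 attributes, 3 latent factors.
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.1 * rng.normal(size=(500, 10))

# Standardise each attribute, then run PCA through the SVD.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)

# Scree plot data: explained variance ratio per component and its running sum.
explained = S ** 2 / np.sum(S ** 2)
cumulative = np.cumsum(explained)

# Intrinsic dimensionality: components needed to reach the threshold (assumed 75%).
threshold = 0.75
intrinsic_dim = int(np.searchsorted(cumulative, threshold) + 1)

# Attribute importance: sum of squared loadings over the retained components.
importance = (Vt[:intrinsic_dim] ** 2).sum(axis=0)
top3_attributes = np.argsort(importance)[::-1][:3]

# Projection onto the top two principal components (2D scatterplot input).
proj2d = Z @ Vt[:2].T

def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS from a square distance matrix D."""
    n = len(D)
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

# Euclidean distance between rows.
D_euclid = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2))
# Correlation distance between rows: 1 - Pearson correlation (np.corrcoef is row-wise).
D_corr = np.clip(1.0 - np.corrcoef(Z), 0.0, None)

mds_euclid = classical_mds(D_euclid)
mds_corr = classical_mds(D_corr)
```

The arrays explained and cumulative feed the scree plot, proj2d, mds_euclid and mds_corr feed the 2D scatterplots, and the columns X[:, top3_attributes] feed the scatterplot matrix. With Euclidean distances, classical MDS reproduces the two-component PCA projection up to a sign flip per axis, so the correlation-distance embedding is the one that adds a different view.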