mirugwe1/Dimensional-Reduction-PCA-Isomap-Multi-dimensional-Scaling-and-KNN-modelling.
This project applies three dimensionality reduction methods, Principal Component Analysis (PCA), metric Multidimensional Scaling (MDS), and Isomap, to two data sets in order to determine which methods and parameters work best on different types of data. The first is a subset of the MNIST handwritten digits data, in which each greyscale image of the digit 5 or 8 is represented as a one-dimensional vector of 785 columns. The second is the Wisconsin Diagnostic Breast Cancer data set (WDBC; source: UCI Machine Learning), which consists of 569 data points classified as either malignant or benign.

We used the k-nearest neighbours (KNN) algorithm to evaluate these reduction methods. KNN models were built on both the original data and the dimensionally reduced data to classify digits in the MNIST data or the patient's cancer status in the WDBC data, and the difference in classification accuracy was used to measure the impact of reducing the dimensions.

On the MNIST data, only Isomap improved the classification rate, and only slightly: by **0.4** percentage points, from **98.5%** to **98.9%**. PCA and metric MDS both lowered it, from **98.5%** to **96.75%**.

On the breast cancer data, only PCA improved performance: with the PCA-reduced data, the model classified every patient's breast cancer status correctly (**100%**). The other reduction methods neither increased nor reduced the classification accuracy of **92.04%** obtained with the original data.
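The comparison described above can be sketched with scikit-learn. This is a minimal illustration, not the repository's actual code: the function names `knn_accuracy` and `compare_reductions`, the standardization step, the 80/20 split, `n_neighbors=5`, the two output dimensions, and the Isomap neighbourhood size are all assumptions chosen for the example. It uses scikit-learn's bundled copy of the WDBC data.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.manifold import MDS, Isomap
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler


def knn_accuracy(features, labels, n_neighbors=5, seed=0):
    """Fit KNN on a train split and return accuracy on the held-out split."""
    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.2, random_state=seed, stratify=labels
    )
    model = KNeighborsClassifier(n_neighbors=n_neighbors)
    model.fit(x_train, y_train)
    return model.score(x_test, y_test)


def compare_reductions(n_components=2):
    """Return KNN accuracy on the original and on each reduced version of WDBC."""
    data = load_breast_cancer()
    # Assumption: features are standardized before reduction.
    x = StandardScaler().fit_transform(data.data)
    y = data.target

    reducers = {
        "PCA": PCA(n_components=n_components),
        # MDS performs metric MDS by default.
        "MDS": MDS(n_components=n_components, random_state=0),
        "Isomap": Isomap(n_components=n_components, n_neighbors=10),
    }

    results = {"original": knn_accuracy(x, y)}
    for name, reducer in reducers.items():
        # MDS has no transform() for unseen points, so every method embeds
        # the full feature matrix first and the split happens afterwards.
        results[name] = knn_accuracy(reducer.fit_transform(x), y)
    return results


if __name__ == "__main__":
    for method, accuracy in compare_reductions().items():
        print(f"{method}: {accuracy:.4f}")
```

Because each embedding is fitted on all rows before the split, the test features influence the reduced coordinates (the labels do not). That keeps the three methods comparable in this sketch, but the accuracies it prints should not be expected to match the figures reported above.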