- PCA and LDA are implemented to perform dimensionality reduction on the dataset, and a K-NN classifier is used for classification
- Data is split into training and test sets with a 50/50 split
- Every image is converted into a vector
- Preprocessing is performed on the training set: mean normalization centers the data and scales it so that the principal components are not skewed toward any particular feature
- The covariance matrix is computed from the training set, and its eigenvectors are extracted
- The first K eigenvectors, chosen to retain 90% of the total variance, are used as the projection matrix
- The training set is projected onto the lower-dimensional space and used to train the K-NN classifier
- The test set is then projected onto the same lower-dimensional space and used to report the accuracy of the classifier; a sketch of this pipeline follows this list
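Below is a minimal NumPy sketch of the PCA pipeline above. The array names `X_train`, `X_test`, `y_train`, `y_test` are hypothetical (flattened images plus labels, assumed already loaded), and `n_neighbors=1` is an assumption since the section does not specify the K used by the classifier:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def fit_pca(X, variance_to_retain=0.90):
    # Mean normalization: center the data so no single feature skews the PCs.
    mean = X.mean(axis=0)
    Xc = X - mean
    # Covariance matrix of the centered training set.
    cov = np.cov(Xc, rowvar=False)
    # eigh handles the symmetric covariance matrix; eigenvalues come out ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # sort descending
    # Smallest K whose leading eigenvalues retain 90% of the total variance.
    ratios = np.cumsum(eigvals) / eigvals.sum()
    K = int(np.searchsorted(ratios, variance_to_retain)) + 1
    return mean, eigvecs[:, :K]  # projection matrix of shape (n_pixels, K)

mean, W = fit_pca(X_train)
knn = KNeighborsClassifier(n_neighbors=1)          # K for K-NN is an assumption
knn.fit((X_train - mean) @ W, y_train)             # train on the projected training set
accuracy = knn.score((X_test - mean) @ W, y_test)  # project test set, report accuracy
```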
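For face images the pixel dimensionality usually far exceeds the number of training samples, so in practice the eigenvectors are often computed from the smaller N×N Gram matrix of the centered data rather than the full covariance matrix; the sketch uses the direct covariance for clarity.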
- LDA fails to find the lower-dimensional space when the dimensionality is much higher than the number of samples in the data matrix: the within-class scatter matrix becomes singular, which is known as the small sample size (SSS) problem. PCA is therefore performed before LDA to regularize the problem and avoid over-fitting.
- Data is split into training and test sets with a 50/50 split
- Every image is converted into a vector
- PCA is performed on the training set, with the number of PCs chosen equal to the rank of the within-class scatter matrix before PCA
- The training set is then projected onto the lower-dimensional space
- The training set is split by class label
- The mean vector for every class is computed
- Using the mean vector of every class, the between-class scatter matrix is computed
- The eigenvectors of inv(Sw) · Sb are computed, where Sw and Sb are the within-class and between-class scatter matrices
- The first 39 eigenvectors are chosen to be used as the projection matrix
- The training set is projected onto the lower-dimensional space and used to train the K-NN classifier
- The test set is then projected onto the same lower-dimensional space and used to report the accuracy of the classifier; see the sketch after this list
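A corresponding sketch of the LDA stage, under the same assumptions: `Z_train` and `Z_test` are hypothetical names for the PCA-projected training and test sets from the previous step.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def fit_lda(Z, y, n_components=39):
    # Z is assumed to be the PCA-projected training set, so the within-class
    # scatter matrix below is non-singular (avoiding the SSS problem).
    classes = np.unique(y)
    overall_mean = Z.mean(axis=0)
    d = Z.shape[1]
    S_w = np.zeros((d, d))  # within-class scatter matrix
    S_b = np.zeros((d, d))  # between-class scatter matrix
    for c in classes:
        Zc = Z[y == c]                      # split the training set by class label
        mean_c = Zc.mean(axis=0)            # mean vector of class c
        centered = Zc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_b += len(Zc) * (diff @ diff.T)
    # Discriminant directions: eigenvectors of inv(S_w) @ S_b.
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1]
    # Keep the top 39 eigenvectors as the projection matrix.
    return eigvecs[:, order[:n_components]].real

W_lda = fit_lda(Z_train, y_train)
knn = KNeighborsClassifier(n_neighbors=1)          # K is an assumption, as above
knn.fit(Z_train @ W_lda, y_train)                  # train on the LDA-projected data
accuracy = knn.score(Z_test @ W_lda, y_test)       # project test set, report accuracy
```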
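Keeping 39 eigenvectors matches the usual LDA limit of at most C − 1 discriminant directions (the between-class scatter matrix has rank at most C − 1), which suggests the dataset contains 40 classes.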