This Python notebook contains the code and documentation for the 3rd programming assignment for the "Μηχανική Μάθηση" (Machine Learning) course (ΕΠ08) at NKUA. In this assignment, we perform various tasks related to machine learning, including data preprocessing, hyperparameter tuning, SVM classification, and dimensionality reduction using PCA (Principal Component Analysis).
To run this notebook, follow these steps:
- Mount Google Drive: Before starting, you need to mount your Google Drive to access the necessary data files. The code for mounting Google Drive is provided in the notebook.
- Data Files: The assignment assumes the presence of two data files: mnist_test.csv and mnist_train.csv. Make sure these files are available in your Google Drive at the specified paths.
- Dependencies: Ensure that you have the required Python libraries installed, such as NumPy, pandas, scikit-learn, and others. You can install these libraries using pip or conda.
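The setup steps above can be sketched roughly as follows. This is only an illustration, not the notebook's actual cell: the `/content/drive/MyDrive` folder is an assumed location for the CSV files, and the fallback to the current directory outside Colab is added here so the snippet also runs locally.

```python
from pathlib import Path

try:
    # google.colab is only importable inside a Google Colab runtime.
    from google.colab import drive
    drive.mount("/content/drive")
    data_dir = Path("/content/drive/MyDrive")  # assumed folder, adjust to your Drive layout
except ImportError:
    data_dir = Path(".")  # fallback for local runs (an assumption, not from the notebook)

train_path = data_dir / "mnist_train.csv"
test_path = data_dir / "mnist_test.csv"

# Report which of the two expected data files are not where we looked.
missing = [p.name for p in (train_path, test_path) if not p.exists()]
if missing:
    print("Missing data files:", ", ".join(missing))
```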
This section includes importing necessary libraries and reading the data files into Pandas DataFrames. It also normalizes the datasets.
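As a minimal sketch of this step, the helper below reads a CSV into a DataFrame and scales the pixel values. The README does not say which normalization the notebook applies, so dividing by 255 is an assumption, as are the `label` column name and the hypothetical `load_and_normalize` helper. A tiny in-memory CSV stands in for the real files so the example runs on its own.

```python
import io

import pandas as pd


def load_and_normalize(source, label_col="label"):
    """Read a CSV of labelled images and scale pixel values to [0, 1]."""
    df = pd.read_csv(source)
    y = df[label_col].to_numpy()
    # Assumes 8-bit grayscale pixels (0-255) in every non-label column.
    X = df.drop(columns=[label_col]).to_numpy(dtype="float64") / 255.0
    return X, y


# Two fake "images" with two pixels each, standing in for mnist_train.csv.
demo_csv = io.StringIO("label,1x1,1x2\n7,0,255\n2,128,64\n")
X_demo, y_demo = load_and_normalize(demo_csv)
print(X_demo.shape, y_demo)
```

With the real data, `source` would be the path to mnist_train.csv or mnist_test.csv.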
In this section, we split the datasets into training and testing sets.
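One common way to do such a split is scikit-learn's `train_test_split`. The sketch below is hedged: the 80/20 ratio, the stratification, and the fixed seed are assumptions, and scikit-learn's small built-in digits dataset is used as a stand-in for the MNIST CSV files.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

# Stand-in data: 8x8 digit images with pixel values in 0-16.
X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale to [0, 1]

# Stratify so every digit class keeps the same proportion in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
print(X_train.shape, X_test.shape)
```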
We tune the hyperparameters of an SVM (Support Vector Machine) classifier using GridSearchCV and keep the best-scoring parameter combination for the final model.
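A grid search of this kind might look like the sketch below. The parameter grid, the 3-fold cross-validation, and the 500-sample subset are illustrative assumptions, since the README does not list the values the notebook searches. The digits dataset again stands in for MNIST.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X / 16.0, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical grid: every combination of C and kernel is cross-validated.
param_grid = {"C": [1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=3)

# A subset keeps the search fast; SVM training cost grows quickly with sample count.
search.fit(X_train[:500], y_train[:500])

print("best parameters:", search.best_params_)
print("best CV accuracy:", round(search.best_score_, 3))
```

Searching on a subset is a common shortcut when the full training set makes each SVM fit slow, at the cost of a noisier estimate of the best parameters.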
We train an SVM classifier using the best parameters obtained from hyperparameter tuning. The classifier is evaluated on the test dataset, and classification metrics such as precision, recall, and F1-score are calculated.
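The training and evaluation step can be sketched as follows. The `best_params` values are hypothetical placeholders for whatever the grid search returns, and the digits dataset is a stand-in for the MNIST files.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X / 16.0, y, test_size=0.2, stratify=y, random_state=0
)

# Placeholder values; in the notebook these come from the hyperparameter search.
best_params = {"C": 10, "kernel": "rbf"}
clf = SVC(**best_params).fit(X_train, y_train)

y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

# Per-class precision, recall, and F1-score, plus averaged rows.
report = classification_report(y_test, y_pred)
print(report)
```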
In this section, we perform dimensionality reduction using PCA. We reduce the dataset so that the kept components retain a given fraction of the variance (0.95 and 0.75) and evaluate the SVM classifier's performance on the reduced datasets.
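In scikit-learn, passing a float between 0 and 1 as `n_components` makes PCA keep just enough components to exceed that fraction of explained variance. The sketch below illustrates the idea on the stand-in digits dataset; the SVM parameters are placeholders, and `svd_solver="full"` is set explicitly because the fractional form relies on a full decomposition.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X / 16.0, y, test_size=0.2, stratify=y, random_state=0
)

results = {}
for variance in (0.95, 0.75):
    # Fit PCA on the training data only, then project both sets.
    pca = PCA(n_components=variance, svd_solver="full").fit(X_train)
    clf = SVC(C=10, kernel="rbf").fit(pca.transform(X_train), y_train)
    results[variance] = {
        "n_components": pca.n_components_,
        "explained": pca.explained_variance_ratio_.sum(),
        "accuracy": clf.score(pca.transform(X_test), y_test),
    }

for variance, row in results.items():
    print(variance, row)
```

Fitting PCA on the training set alone avoids leaking information from the test set into the projection.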
The results of the SVM classifier and dimensionality reduction experiments are provided in the notebook. These results include accuracy, precision, recall, F1-score, and the execution time for each experiment.
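One way to collect these figures per experiment is a small helper like the hypothetical `timed_evaluation` below. Whether the notebook times training only or training plus prediction is not stated, so timing both together is an assumption, as is the macro averaging of precision, recall, and F1-score.

```python
import time

from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def timed_evaluation(clf, X_train, y_train, X_test, y_test):
    """Fit and predict once, returning metrics and elapsed wall-clock seconds."""
    start = time.perf_counter()
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    elapsed = time.perf_counter() - start
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "seconds": elapsed,
    }


X, y = load_digits(return_X_y=True)  # stand-in for the MNIST CSV files
X_train, X_test, y_train, y_test = train_test_split(
    X / 16.0, y, test_size=0.2, stratify=y, random_state=0
)
row = timed_evaluation(SVC(C=10, kernel="rbf"), X_train, y_train, X_test, y_test)
print(row)
```

Calling the helper once per experiment (full data, 0.95 PCA, 0.75 PCA) gives rows that can be compared side by side.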
This project is licensed under the MIT License - see the LICENSE.md file for details.