/K-NearestNeighborsClassifier

This is the first machine learning assignment I completed, in October 2022, during the 5th semester of my undergraduate course in Computer Science with Artificial Intelligence and Machine Learning at Dayananda Sagar University, Bengaluru, India.

Primary language: Jupyter Notebook

K-Nearest Neighbors Classifier with custom algorithm and library implementations

Problem statement

Build a k-nearest neighbors (kNN) classifier in Python to classify the MNIST digit data. This is a multi-class classification problem with labels from 0 to 9.
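
For orientation, a kNN classifier labels a query point by majority vote among its k closest training points. The following is a minimal sketch of that idea, not the notebook's code: it assumes Euclidean distance and simple first-seen tie-breaking, neither of which is stated in this README, and `knn_predict` is a hypothetical name.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Return the majority label among the k training points nearest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Keep the k closest points, sorting on distance only.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote; ties go to the label encountered first among the neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D points standing in for flattened digit images (illustrative data only).
train_X = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = [0, 0, 1, 1, 1]
prediction = knn_predict(train_X, train_y, (4.9, 5.0), k=3)
```

For MNIST, each point would be a flattened image vector instead of a 2-D tuple; the voting logic stays the same.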

Values

  1. Number of neighbors 'k': 44

  2. Number of data points: 7100

  3. Testing process (cross-validation): 1420 test points per run, i.e. 20% of the data picked at random. The test is repeated k times, with a fresh random 20% test set drawn each time.
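
The testing process above can be sketched as a repeated random hold-out split. This is an illustration only: `random_split` is a hypothetical helper, the fixed seed is there just for reproducibility, and it assumes the number of repetitions is the same k = 44 as the number of neighbors, which is how the list above reads.

```python
import random


def random_split(n_points, test_fraction, rng):
    """Return (train_indices, test_indices) for one random hold-out split."""
    n_test = int(round(n_points * test_fraction))
    indices = list(range(n_points))
    rng.shuffle(indices)
    # The first n_test shuffled indices form the test set, the rest the training set.
    return indices[n_test:], indices[:n_test]


rng = random.Random(0)      # fixed seed, only so the sketch is reproducible
n_points, k = 7100, 44      # values from the list above
splits = [random_split(n_points, 0.2, rng) for _ in range(k)]
train_idx, test_idx = splits[0]
```

Because every run draws its test set independently, the same data point can appear in the test set of several runs.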

Implementation

Write the code for a custom kNN algorithm and generate a confusion matrix averaged over the test runs. Repeat the process using scikit-learn's KNeighborsClassifier.
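
The library half of this procedure might look like the sketch below. It is not the notebook's code: synthetic data stands in for the MNIST digits so the example runs on its own, and the values of `n_repeats` and `n_neighbors` are deliberately small illustrative choices, not the project's k = 44.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the digit data: 10 classes, 20 features.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(10), 50)                           # 500 labels, 50 per class
X = rng.normal(size=(500, 20)) + 4.0 * np.eye(10, 20)[y]   # class-dependent shift

n_repeats, n_neighbors = 5, 7   # illustrative values only
labels = np.arange(10)
total = np.zeros((10, 10))

for seed in range(n_repeats):
    # Fresh random 80/20 split on every repetition.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = KNeighborsClassifier(n_neighbors=n_neighbors).fit(X_train, y_train)
    # Passing labels keeps the matrix 10x10 even if a class is absent from a test set.
    total += confusion_matrix(y_test, model.predict(X_test), labels=labels)

mean_cm = total / n_repeats     # averaged confusion matrix
```

Summing the per-run matrices and dividing once at the end keeps the averaging to a single floating-point division per entry.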

Anomalies

  1. In the custom kNN classifier, floating-point rounding during averaging means that the sum of all elements in the final confusion matrix is not always exactly the number of test data points used; it can instead be a value extremely close to it.

  2. Since the confusion matrix printed at the end holds averaged values, its elements may be non-integers: the per-cell totals across the runs are not necessarily divisible by k.
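
Both effects can be reproduced on a tiny example. The sketch below averages three made-up 2-class confusion matrices in plain Python; `count_confusions` and `average_matrices` are hypothetical helpers, and the labels are invented purely for illustration.

```python
import math


def count_confusions(y_true, y_pred, n_classes):
    """Build a confusion matrix: rows are true labels, columns are predictions."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def average_matrices(matrices):
    """Element-wise mean of equally sized square matrices."""
    n = len(matrices)
    size = len(matrices[0])
    return [
        [sum(m[i][j] for m in matrices) / n for j in range(size)]
        for i in range(size)
    ]


# Three made-up runs, each with 4 test points and 2 classes.
runs = [
    count_confusions([0, 0, 1, 1], [0, 0, 1, 1], 2),
    count_confusions([0, 0, 1, 1], [0, 1, 1, 1], 2),
    count_confusions([0, 0, 1, 1], [0, 0, 0, 1], 2),
]
mean_cm = average_matrices(runs)
total = sum(sum(row) for row in mean_cm)
```

Here the per-cell totals (5, 1, 1 and 5) are not divisible by the 3 runs, so every averaged entry is fractional, and `total` should be compared with the test-set size using a tolerance such as `math.isclose(total, 4.0)` instead of `==`.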

List of Files

File Name                                   Description
K-Nearest Neighbors Classifier - Custom     A kNN classifier using a hardcoded algorithm
K-Nearest Neighbors Classifier - Library    A kNN classifier using scikit-learn's library classifier
Documentation                               Documentation of the project