
K-Nearest Neighbors implemented from scratch, comparing the distance metrics [Euclidean, Minkowski, Manhattan, Hamming] to find the optimal accuracy; best accuracy: 96.67%.


KNearestNeighbors

Objective

Optimize two hyperparameters (K-value and distance function) for a K-Nearest Neighbors model.

Distance Function: [Euclidean, Minkowski, Manhattan, Hamming] (sketched below)
K-Value: the output has 3 classes, so an even K-value is recommended
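
A minimal sketch of the four distance functions, assuming NumPy arrays as inputs (names and signatures are illustrative, not necessarily those used in src/main.py):

```python
import numpy as np

def euclidean(a, b):
    # Straight-line distance: square root of the sum of squared differences
    return np.sqrt(np.sum((a - b) ** 2))

def manhattan(a, b):
    # City-block distance: sum of absolute differences
    return np.sum(np.abs(a - b))

def minkowski(a, b, p=3):
    # Generalization of the above: p=1 gives Manhattan, p=2 gives Euclidean
    return np.sum(np.abs(a - b) ** p) ** (1 / p)

def hamming(a, b):
    # Fraction of coordinates where the two vectors differ
    return np.mean(a != b)
```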

Model

K-Nearest Neighbors: A non-parametric classification model that computes the distance from each test observation to every observation in the training dataset and outputs the class with the highest frequency among the K most similar instances.
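
As a sketch of that prediction rule (assuming the distance functions above; knn_predict is an illustrative name, not the repo's API):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_test, k, distance):
    # Distance from the test observation to every training observation
    dists = [distance(x, x_test) for x in X_train]
    # Indices of the K most similar (closest) training instances
    nearest = np.argsort(dists)[:k]
    # Majority vote: output the class with the highest frequency
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]
```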

KNN Disciplines
Lazy Learning: Training is not required and all of the work happens at the time a prediction is requested.
Instance-Based Learning: Raw training instances are used to make predictions.
Non-Parametric: KNN makes no assumptions about the functional form of the problems being solved.

AVOID

  • Curse of Dimensionality: as the number of dimensions increases, the volume of the input space grows at an exponential rate, so training points become sparse and "nearest" neighbors stop being near (illustrated below)
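
A quick numeric illustration of that exponential growth: the edge length a sub-cube needs to cover a fixed 10% of a d-dimensional unit cube approaches 1 as d grows.

```python
# Edge length of a sub-cube covering 10% of a d-dimensional unit cube
for d in (1, 2, 10, 100):
    print(d, round(0.10 ** (1 / d), 2))
# prints: 1 0.1, 2 0.32, 10 0.79, 100 0.98
```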

Repository File Structure

├── src          
│   └── main.py              # Optimizes the two hyperparameters (K-value and distance function) for the KNN model
├── plots
│   └── ErrorRatekValue.png  # Error rate vs. K-value plot
├── requierments.txt         # Packages used for the project
└── README.md

Outputs & Distance Functions

  • Euclidean Distance: K-Nearest Neighbor accuracy 96.67%
  • Minkowski Distance: K-Nearest Neighbor accuracy 96.67%
  • Manhattan Distance: K-Nearest Neighbor accuracy 93.33%
  • Hamming Distance: K-Nearest Neighbor accuracy 93.33%
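
The accuracies above come from sweeping both hyperparameters. A minimal sketch of such a sweep, reusing the knn_predict and distance sketches from earlier and a train/test split like the one sketched in the Data section below (the actual loop lives in src/main.py; the K range 1-20 is an assumption):

```python
import numpy as np

def accuracy(X_train, y_train, X_test, y_test, k, distance):
    # Fraction of test observations classified correctly for one (k, distance) pair
    preds = [knn_predict(X_train, y_train, x, k, distance) for x in X_test]
    return np.mean(np.asarray(preds) == np.asarray(y_test))

# Error rate per K for each distance function, as plotted in plots/ErrorRatekValue.png
for distance in (euclidean, minkowski, manhattan, hamming):
    errors = [1 - accuracy(X_train, y_train, X_test, y_test, k, distance)
              for k in range(1, 21)]
    print(distance.__name__, "best accuracy:", 1 - min(errors))
```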

Data

Target Class:
Iris-setosa       float64
Iris-versicolor   float64
Iris-virginica    float64

Features:     
Sepal-width       float64
Sepal-length      float64
Petal-width       float64
Petal-length      float64
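
A minimal sketch of loading and splitting this data (the CSV file name, the label column name, and the 80/20 split are assumptions, not necessarily what src/main.py does):

```python
import numpy as np
import pandas as pd

df = pd.read_csv("iris.csv")  # assumed file name
X = df[["Sepal-length", "Sepal-width", "Petal-length", "Petal-width"]].to_numpy()
y = df["Class"].to_numpy()    # assumed label column holding the three species

# Shuffle, then hold out 20% of the rows as the test set
rng = np.random.default_rng(0)
idx = rng.permutation(len(X))
split = int(0.8 * len(X))
X_train, X_test = X[idx[:split]], X[idx[split:]]
y_train, y_test = y[idx[:split]], y[idx[split:]]
```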