This repository sets up continuous integration for packaging my simple k-Nearest Neighbors (kNN) implementation and publishing it to PyPI.
For the notebook version, please visit this repository.
k-Nearest Neighbors, kNN for short, is a very simple but powerful technique for making predictions. The principle behind kNN is to predict a new data point from the historical examples that are most similar to it.
- Choose a value for k
- Compute the distance from the new point to each record in the training data
- Select the k nearest neighbors
- Make predictions
  - For a classification problem, the new data point is assigned to the class that most of its k nearest neighbors belong to.
  - For a regression problem, the prediction can be the average or weighted average of the labels of the k nearest neighbors.
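The steps above can be sketched in plain Python. This is a minimal illustration, not the code shipped in the package: the function names (`euclidean_distance`, `k_nearest`, `predict_class`, `predict_value`) are hypothetical, Euclidean distance is an assumed choice of metric, and the regression variant uses the unweighted average only.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(train_X, train_y, query, k):
    """Return the labels of the k training records closest to `query`."""
    # Distance from the query to every training record.
    distances = [(euclidean_distance(row, query), label)
                 for row, label in zip(train_X, train_y)]
    # Keep the k records with the smallest distances.
    distances.sort(key=lambda pair: pair[0])
    return [label for _, label in distances[:k]]


def predict_class(train_X, train_y, query, k=3):
    """Classification: majority vote among the k nearest labels."""
    neighbors = k_nearest(train_X, train_y, query, k)
    return Counter(neighbors).most_common(1)[0][0]


def predict_value(train_X, train_y, query, k=3):
    """Regression: unweighted mean of the k nearest labels."""
    neighbors = k_nearest(train_X, train_y, query, k)
    return sum(neighbors) / len(neighbors)


# Toy data (made up for illustration): two well-separated clusters.
train_X = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.9], [0.9, 1.1]]
class_labels = ["a", "a", "b", "b", "a"]
numeric_labels = [1.0, 2.0, 10.0, 11.0, 3.0]

predicted_class = predict_class(train_X, class_labels, [1.0, 0.9], k=3)
predicted_value = predict_value(train_X, numeric_labels, [1.0, 0.9], k=3)
```

Sorting all distances keeps the sketch short; for large training sets a partial selection such as `heapq.nsmallest` would avoid the full sort.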
Finally, we evaluate the model using the k-Fold Cross Validation technique.
This technique involves randomly dividing the dataset into k groups, or folds, of approximately equal size. The first fold is kept for testing and the model is trained on the remaining k-1 folds. The process is then repeated so that each fold serves once as the test set, and the scores are averaged.
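The fold construction can be sketched as follows. This is an illustrative outline under stated assumptions, not the package's `kFoldCV` implementation: the helper names `k_fold_indices` and `cross_validate`, the `seed` parameter, and the `evaluate` callback are all hypothetical, and the loop uses the standard rotation in which every fold is held out once.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding puts every k-th shuffled index in the same fold,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, k, evaluate, seed=0):
    """Call evaluate(train_idx, test_idx) once per fold and average the scores."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the one currently held out for testing.
        train_idx = [idx for j, fold in enumerate(folds) if j != i
                     for idx in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / len(scores)
```

Here `evaluate` stands in for whatever trains a model on the training indices and returns a score on the test indices, for example classification accuracy.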
pip install simple-kNN
from simple_kNN.distanceMetrics import distanceMetrics
from simple_kNN.kFoldCV import kFoldCV
from simple_kNN.kNNClassifier import kNNClassifier
- My Medium article on building kNN from scratch
- More information on cross-validation is available here
- kNN
- kFold Cross Validation
- Other variants of kNN algorithm
- Recommendations using kNN algorithm