K-Means-Clustering-on-Image-dataset: A Jupyter Notebook repository from PranshuPandya

Project Title: K-Means Image Clustering with Feature Extraction on MNIST

Overview

This project explores the application of K-Means clustering to the MNIST handwritten digits dataset. We implement K-Means from scratch and evaluate the impact of different feature extraction techniques (raw pixels, PCA, HOG) on clustering performance.

Prerequisites

Python 3.x
Libraries:
- scikit-learn
- OpenCV (cv2)
- matplotlib
- NumPy
- seaborn (optional, for better visualizations)

Note: The notebook will install the required libraries in the first cell.

MNIST Dataset: downloaded automatically by the notebook; a copy is also included in the dataset folder of the repository.

Notebook Structure

K-Means Clustering Theory:
- Brief explanation of K-Means clustering concepts.
- Intuition behind the algorithm.
Implementation from Scratch:
- Code for a custom K-Means implementation.
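
To illustrate what a from-scratch implementation might look like, here is a minimal sketch of Lloyd's algorithm in NumPy. It is not the notebook's code: the function name `kmeans`, its parameters, the random-sample initialisation, and the toy two-blob data are all assumptions made for this example.

```python
import numpy as np


def kmeans(X, k, n_iters=100, tol=1e-6, seed=0):
    """Minimal Lloyd's algorithm (hypothetical helper). Returns (centroids, labels)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids as k distinct samples drawn at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()

    def assign(current):
        # Squared Euclidean distances via ||x||^2 - 2 x.c + ||c||^2,
        # which avoids building an (n, k, d) intermediate array.
        d2 = (
            (X ** 2).sum(axis=1)[:, None]
            - 2.0 * X @ current.T
            + (current ** 2).sum(axis=1)[None, :]
        )
        return d2.argmin(axis=1)

    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = assign(centroids)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                new_centroids[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:  # stop once the centroids have stopped moving
            break

    return centroids, assign(centroids)


# Toy data (an assumption for the demo): two well-separated 2-D blobs.
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=10.0, scale=0.5, size=(50, 2))
X_demo = np.vstack([blob_a, blob_b])

centroids, labels = kmeans(X_demo, k=2)
```

For image data, each image would first be flattened to a row vector (or replaced by a feature vector) before being passed as `X`.
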
Dataset Introduction (MNIST):
- Description of the MNIST dataset.
- Sample visualizations of the images.
Feature Extraction:
- Rationale for using feature extraction before clustering.
- Implementation and comparison of:
  - Raw pixel data
  - Principal Component Analysis (PCA)
  - Histogram of Oriented Gradients (HOG)
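
The three representations listed above can be sketched in plain NumPy as follows. This is an illustrative sketch, not the notebook's implementation: the helper names `pca_features` and `hog_like` are hypothetical, the input is random data shaped like 28x28 MNIST images, and `hog_like` is a simplified HOG-style descriptor that normalises over the whole image rather than over overlapping blocks as full HOG does.

```python
import numpy as np


def pca_features(X, n_components):
    """Project rows of X onto the top principal directions (PCA via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def hog_like(images, cell=7, n_bins=9):
    """Simplified HOG-style features: per-cell histograms of unsigned
    gradient orientation, weighted by gradient magnitude, then
    L2-normalised per image (no block normalisation)."""
    n, h, w = images.shape
    gy, gx = np.gradient(images.astype(float), axis=(1, 2))
    magnitude = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
    bins = np.minimum((angle / (180.0 / n_bins)).astype(int), n_bins - 1)

    cells_y, cells_x = h // cell, w // cell
    feats = np.zeros((n, cells_y, cells_x, n_bins))
    for i in range(cells_y):
        for j in range(cells_x):
            rows = slice(i * cell, (i + 1) * cell)
            cols = slice(j * cell, (j + 1) * cell)
            b = bins[:, rows, cols].reshape(n, -1)
            m = magnitude[:, rows, cols].reshape(n, -1)
            for k in range(n_bins):
                # Sum the gradient magnitude that falls into orientation bin k.
                feats[:, i, j, k] = (m * (b == k)).sum(axis=1)

    feats = feats.reshape(n, -1)
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    return feats / np.maximum(norms, 1e-12)


# Stand-in data (an assumption): 20 random arrays shaped like MNIST images.
rng = np.random.default_rng(0)
images = rng.random((20, 28, 28))

raw = images.reshape(len(images), -1)     # raw pixels: 784 values per image
pca = pca_features(raw, n_components=10)  # 10 principal components
hog = hog_like(images)                    # 4 x 4 cells x 9 bins = 144 values
```

Each of `raw`, `pca`, and `hog` is a 2-D array with one row per image, so any of them can be passed to a K-Means routine unchanged.
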
Clustering and Evaluation:
- K-Means clustering applied with each feature representation.
- Visualization of clusters.
- Evaluation using metrics like silhouette score and confusion matrices.
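
A hedged sketch of how the two evaluation ideas above could be computed is shown below. It is written in plain NumPy for illustration and is not taken from the notebook; the function names, the majority-vote accuracy, and the four-point toy data are assumptions. The silhouette coefficient needs no labels, while the cluster-versus-class table relies on the true digit labels.

```python
import numpy as np


def silhouette_mean(X, labels):
    """Mean silhouette coefficient with Euclidean distance.
    Builds the full pairwise distance matrix, so it suits small samples only."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # singleton clusters score 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = dist[i, own].sum() / (n_own - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()


def cluster_confusion(y_true, cluster_ids, n_classes, k):
    """Count table with one row per cluster and one column per true class."""
    table = np.zeros((k, n_classes), dtype=int)
    np.add.at(table, (np.asarray(cluster_ids), np.asarray(y_true)), 1)
    return table


def majority_vote_accuracy(table):
    """Name each cluster after its most frequent class and score the result."""
    return table.max(axis=1).sum() / table.sum()


# Toy example (an assumption): two tight pairs of points, clustered perfectly
# but with cluster ids that do not match the class ids.
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
y_toy = np.array([0, 0, 1, 1])
clusters_toy = np.array([1, 1, 0, 0])

score = silhouette_mean(X_toy, clusters_toy)
table = cluster_confusion(y_toy, clusters_toy, n_classes=2, k=2)
accuracy = majority_vote_accuracy(table)
```

Because K-Means cluster ids are arbitrary, the table is read row by row: a good clustering concentrates each row in a single column, whichever column that happens to be.
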
Results and Conclusions:
- Summary of findings.
- Insights into the effectiveness of different feature extraction techniques.

How to Run

Execute the notebook cells in order. The first cell installs the required libraries, and the dataset is downloaded automatically when needed.

References

- K-Means Clustering Intuition: A Simple Explanation of K-Means Clustering: https://www.analyticsvidhya.com/blog/2020/10/a-simple-explanation-of-k-means-clustering/
- HOG: Medium Article - HOG Feature Descriptor: https://medium.com/@dnemutlu/hog-feature-descriptor-263313c3b40d
