/M3_H06

K-means clustering exploration on 2D and MNIST datasets with PCA visualization. Find the balance in data, reduced and clustered.

Primary language: Jupyter Notebook. License: MIT.

Clustering Analysis of 2D Data and MNIST Dataset

Project Overview

This project applies K-means clustering to two datasets: a simple 2D dataset and the more complex MNIST dataset of handwritten digits. For MNIST, PCA (Principal Component Analysis) first reduces the data to two principal components, and clustering is then run on that 2D projection.
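The MNIST part of the workflow can be sketched as follows. This is a minimal illustration and not the notebook's actual code: it assumes scikit-learn's bundled load_digits data (8x8 images) as an offline stand-in for MNIST, and the choice of 10 clusters (one per digit class) is also an assumption.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Assumption: load_digits (1797 images of 8x8 pixels) stands in for MNIST
# so that the sketch runs without downloading anything.
X, _ = load_digits(return_X_y=True)

# Project the 64 pixel features onto the first two principal components.
X_2d = PCA(n_components=2, random_state=0).fit_transform(X)

# Cluster the projected points. n_clusters=10 is an assumption here;
# the elbow method can be used to pick a value instead.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_2d)

print(X_2d.shape)                     # shape of the reduced data
print(kmeans.cluster_centers_.shape)  # one 2D centroid per cluster
```

Clustering after the projection keeps the cluster assignments directly comparable to what a 2D scatter plot shows, at the cost of discarding the variance held by the remaining components.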

Features

  • Implementation of the K-means clustering algorithm.
  • Use of the elbow method to determine the optimal number of clusters.
  • Dimensionality reduction using PCA for the MNIST dataset.
  • Data visualization for both the 2D and PCA-reduced MNIST datasets.
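As a rough sketch of the elbow method listed above: fit K-means for a range of k values and record the inertia (within-cluster sum of squared distances) for each. The synthetic make_blobs data and the helper name inertia_by_k are assumptions for illustration, not taken from the notebook.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Assumption: synthetic 2D blobs stand in for the project's 2D dataset.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=0)


def inertia_by_k(X, k_values):
    """Fit K-means once per k and return the inertia of each fit."""
    inertias = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        inertias.append(model.inertia_)
    return inertias


k_values = list(range(1, 9))
inertias = inertia_by_k(X, k_values)

# The "elbow" is the k after which inertia stops dropping sharply.
for k, inertia in zip(k_values, inertias):
    print(f"k={k}: inertia={inertia:.1f}")
```

Inertia generally keeps falling as k grows, so the method looks for the point of diminishing returns instead of the minimum.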

How to Run

Ensure that you have Python installed on your system, along with the following packages:

  • pandas
  • matplotlib
  • scikit-learn
  • notebook or jupyterlab (needed to open the .ipynb file)

Clone the repository and navigate to the project directory:

git clone https://github.com/UkrainianEagleOw/M2_H06.git
cd M2_H06

Run Jupyter Notebook or JupyterLab, and open the .ipynb file:

jupyter notebook

Follow the instructions within the notebook to run the analyses.

Results

The notebook produces an elbow plot for choosing the number of clusters, and scatter plots showing the K-means cluster assignments for the 2D data and the PCA-reduced MNIST data.
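The two kinds of plot described here could be produced along these lines. This is a sketch under assumptions, not the notebook's code: the data is synthetic (make_blobs), chosen_k is picked by hand, and the output filename clustering_results.png is hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # off-screen backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Assumption: synthetic 2D blobs stand in for the project's data.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=0)

# Inertia for each candidate k, used for the elbow plot.
k_values = list(range(1, 9))
inertias = [
    KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    for k in k_values
]

chosen_k = 4  # assumption: value read off the elbow plot by eye
model = KMeans(n_clusters=chosen_k, n_init=10, random_state=0).fit(X)

fig, (ax_elbow, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: inertia against the number of clusters.
ax_elbow.plot(k_values, inertias, marker="o")
ax_elbow.set_xlabel("Number of clusters k")
ax_elbow.set_ylabel("Inertia")
ax_elbow.set_title("Elbow plot")

# Right panel: points colored by cluster, centroids marked with red crosses.
ax_scatter.scatter(X[:, 0], X[:, 1], c=model.labels_, s=15)
ax_scatter.scatter(
    model.cluster_centers_[:, 0],
    model.cluster_centers_[:, 1],
    c="red",
    marker="x",
    s=80,
)
ax_scatter.set_title(f"K-means clusters (k={chosen_k})")

fig.tight_layout()
fig.savefig("clustering_results.png")  # hypothetical output filename
```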

License

This project is licensed under the terms of the MIT license.

Contact

For questions or discussion, reach out on LinkedIn: linkedin.com/in/dmytro-filin.