This Python application demonstrates K-Means clustering on various datasets and provides a modularized structure for loading data and performing clustering. The code is organized into two modules: data_loader and clustering.
These instructions will help you set up and run the project on your local machine.
- Python 3.x
- NumPy
- scikit-learn
- seaborn
- matplotlib
- ruff
- pytest
You can install the required dependencies using pip:
pip install numpy scikit-learn seaborn matplotlib ruff pytest
- Clone the repository:
git clone https://github.com/rohit1901/py-cluster.git
cd py-cluster
- Run the main script:
python main_1.py
python main_2.py
- data_utils module: Responsible for loading data and extracting dimensions and samples.
- clustering module: Implements K-Means clustering and related functions.
- classify_unknown_samples module: Implements a function to classify unknown samples using a trained model.
- main_1 script: Demonstrates classification of unknown samples using nearest neighbour classification.
- main_2 script: Demonstrates K-Means clustering on various datasets.
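As a rough illustration of what the two scripts demonstrate, the sketch below clusters synthetic 2-D points with K-Means and then labels a few unknown samples by their nearest neighbour. It is not the project's own code: it uses scikit-learn's KMeans and KNeighborsClassifier directly, and the helper names cluster_points and classify_unknown, the synthetic data, and all parameter values are assumptions made for this example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier


def cluster_points(samples, n_clusters, seed=0):
    """Hypothetical helper: run K-Means and return labels and centroids."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(samples)
    return labels, model.cluster_centers_


def classify_unknown(samples, labels, unknown):
    """Hypothetical helper: give each unknown sample the label of its
    single nearest neighbour among the already labelled samples."""
    classifier = KNeighborsClassifier(n_neighbors=1)
    classifier.fit(samples, labels)
    return classifier.predict(unknown)


# Synthetic data (an assumption): three well-separated groups of 50 points.
samples, _ = make_blobs(
    n_samples=150,
    centers=[[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]],
    cluster_std=0.5,
    random_state=42,
)

labels, centers = cluster_points(samples, n_clusters=3)

# Two made-up unknown samples, one near each of the first two groups.
unknown = np.array([[0.5, -0.5], [9.0, 11.0]])
predicted = classify_unknown(samples, labels, unknown)

print("Centroids:\n", centers)
print("Labels for unknown samples:", predicted)
```

The cluster numbers themselves are arbitrary, so only whether two samples share a label is meaningful, not the label value.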
To run the unit tests for the application, use the following command:
pytest
This project is licensed under the MIT License - see the LICENSE file for details.
- This project was inspired by the need to understand K-Means clustering and its implementation in Python.
- Thanks to the contributors and open-source libraries that made this project possible.