This repository contains a Python implementation of the K-Means clustering algorithm. K-Means is a popular unsupervised learning algorithm used to cluster data points into groups based on their similarity.
The kmeans.py script executes the following steps:
- Initialization: Randomly initialize cluster centroids.
- Assignment: Assign each data point to the nearest cluster centroid.
- Update: Update cluster centroids based on the mean of data points assigned to each cluster.
- Convergence: Repeat the assignment and update steps until no data point changes cluster (this state is called convergence).
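The four steps above can be illustrated with a minimal, standard-library-only sketch. It is not the code from this repository: the function name kmeans, its parameters (k, max_iters, seed), and the choice to initialize centroids by sampling existing data points are assumptions made for illustration.

```python
import random
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, k, max_iters=100, seed=None):
    """Illustrative K-Means over a list of coordinate tuples (hypothetical helper)."""
    rng = random.Random(seed)

    # Initialization: pick k distinct data points as starting centroids
    # (an assumption; other random initialization schemes are possible).
    centroids = rng.sample(points, k)
    assignments = [None] * len(points)

    for _ in range(max_iters):
        # Assignment: index of the nearest centroid for every point.
        new_assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]

        # Convergence: stop once no point has changed cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments

        # Update: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )

    return centroids, assignments


# Example usage with two visibly separated groups of 2D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2, seed=0)
print(centroids)
print(labels)
```

The max_iters cap is a safeguard so the loop always terminates, even if the convergence check is never satisfied within a reasonable number of passes.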
Ensure you have Python 3 and pip installed on your system. Additionally, this program requires the matplotlib library to plot the clustering results.
You can install matplotlib with pip by using the following command:
pip install matplotlib
- Clone the Repository:
git clone https://github.com/joaotav/kmeans-cluster-visualizer.git
The repository contains the following files:
- kmeans_clustering.py: The main Python script implementing the K-Means algorithm.
- dataset.txt: Input dataset containing coordinates of data points.
- result.dat: Output file containing the final clustering result.
- clusters.png: Output file containing the visualization of clustered data points.
To execute the clustering algorithm on the dataset specified in the dataset.txt file, run the following command:
python3 kmeans.py