k_means_gpu

An implementation of the k-means clustering algorithm using GPU acceleration.


Intro

This is an implementation of the k-means clustering algorithm, GPU-accelerated with CUDA C++.

It is intended for execution-time comparison against a CPU-only sequential version. For this reason the algorithm does not stop at convergence: instead, you specify the number of iterations when launching the program.
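
To illustrate how the workload maps to the GPU, here is a minimal sketch of the point-to-centroid assignment step, the part of each iteration that parallelizes naturally (one thread per point). This is an assumption about the approach, not the repository's actual kernel; the names `assignClusters`, `points`, and `centroids` are illustrative.

```cuda
#include <cfloat> // FLT_MAX

// Hypothetical assignment kernel: each thread takes one point and finds
// its nearest centroid by brute-force search over all K centroids.
// Names and data layout are assumptions, not the repository's actual code.
__global__ void assignClusters(const float3 *points, const float3 *centroids,
                               int *assignments, int n, int k)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float bestDist = FLT_MAX;
    int bestCluster = 0;
    for (int c = 0; c < k; ++c) {
        float dx = points[i].x - centroids[c].x;
        float dy = points[i].y - centroids[c].y;
        float dz = points[i].z - centroids[c].z;
        float dist = dx * dx + dy * dy + dz * dz; // squared distance suffices
        if (dist < bestDist) {
            bestDist = dist;
            bestCluster = c;
        }
    }
    assignments[i] = bestCluster;
}
```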

Generating a dataset

You can generate a dataset of N points using datasetgen.py. You must also pass the number of clusters K and the standard deviation STD. The command looks like:

python datasetgen.py N K STD

Example:

python datasetgen.py 1000 3 0.45

Note

The code works with three axes, but the script generates points whose third coordinate is 0.0. This makes it easier to check the results with plot.py.

Usage

./Kmeans N K I

where N is the number of points to read from the dataset, K is the number of clusters, and I is the number of iterations.
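
For context, below is a hedged sketch of how the fixed iteration count could drive the host-side loop. The kernel names and buffer handling are assumptions (the kernels are only declared here; see the assignment-step sketch in the Intro for one possible body), and device allocation, dataset loading, and centroid seeding are omitted.

```cuda
#include <cstdlib>

// Illustrative kernel signatures, assumed rather than taken from the repo.
__global__ void assignClusters(const float3 *points, const float3 *centroids,
                               int *assignments, int n, int k);
__global__ void updateCentroids(const float3 *points, const int *assignments,
                                float3 *centroids, int n, int k);

int main(int argc, char **argv)
{
    if (argc < 4) return 1;
    int n     = std::atoi(argv[1]); // N: points to read from the dataset
    int k     = std::atoi(argv[2]); // K: number of clusters
    int iters = std::atoi(argv[3]); // I: fixed number of iterations

    // Device buffers; allocation, dataset loading, and centroid seeding
    // are omitted from this sketch.
    float3 *d_points = nullptr, *d_centroids = nullptr;
    int *d_assignments = nullptr;

    int threads = 256;
    int blocks  = (n + threads - 1) / threads;

    // Fixed number of iterations: no convergence check, as noted in the Intro.
    for (int it = 0; it < iters; ++it) {
        assignClusters<<<blocks, threads>>>(d_points, d_centroids,
                                            d_assignments, n, k);
        updateCentroids<<<blocks, threads>>>(d_points, d_assignments,
                                             d_centroids, n, k);
    }
    cudaDeviceSynchronize(); // wait for the last iteration before reading back
    return 0;
}
```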

Plotting

You can check the output results with:

python plot.py

Performance

These results were obtained on an NVIDIA GeForce GTX 980 Ti:

| Number of points | Sequential (s) | CUDA (s) | Speed-up |
|------------------|----------------|----------|----------|
| 10               | 0.001          | 0.116    | x0.008   |
| 10²              | 0.015          | 0.117    | x0.1     |
| 10³              | 0.180          | 0.119    | x1.5     |
| 10⁴              | 1.673          | 0.147    | x11.4    |
| 10⁵              | 8.424          | 0.579    | x14.5    |
| 10⁶              | 83.024         | 5.706    | x14.5    |
| 10⁷              | 804.611        | 54.319   | x14.8    |

Acknowledgments

Parallel Computing - Computer Engineering Master's Degree @University of Florence.