k_means_gpu

An implementation of the k-means clustering algorithm using GPU acceleration.


Intro

This is an implementation of the k-means clustering algorithm, GPU-accelerated with CUDA C++.

It is intended for execution-time comparison against a CPU-only sequential version. For this reason the algorithm does not stop at convergence: instead, you specify the number of iterations when launching the program.
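
To illustrate how the workload maps to the GPU, here is a minimal sketch of the point-to-centroid assignment step, the part of each iteration that parallelizes naturally (one thread per point). This is an assumption about the approach, not the repository's actual kernel; the names `assignClusters`, `points`, and `centroids` are illustrative.

```cuda
#include <cfloat> // FLT_MAX

// Hypothetical assignment kernel: each thread takes one point and finds
// its nearest centroid by brute-force search over all K centroids.
// Names and data layout are assumptions, not the repository's actual code.
__global__ void assignClusters(const float3 *points, const float3 *centroids,
                               int *assignments, int n, int k)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float bestDist = FLT_MAX;
    int bestCluster = 0;
    for (int c = 0; c < k; ++c) {
        float dx = points[i].x - centroids[c].x;
        float dy = points[i].y - centroids[c].y;
        float dz = points[i].z - centroids[c].z;
        float dist = dx * dx + dy * dy + dz * dz; // squared distance suffices
        if (dist < bestDist) {
            bestDist = dist;
            bestCluster = c;
        }
    }
    assignments[i] = bestCluster;
}
```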

Generating a dataset

You can generate a dataset of N points using datasetgen.py. You must also pass the number of clusters K and the standard deviation STD. The command looks like:

python datasetgen.py N K STD

Example:

python datasetgen.py 1000 3 0.45

Note

The code works with three axes, but the script generates points whose third coordinate is 0.0. This makes it easier to check the results with plot.py.

Usage

./Kmeans N K I

where N is the number of points to read from the dataset, K is the number of clusters, and I is the number of iterations.
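
For context, below is a hedged sketch of how the fixed iteration count could drive the host-side loop. The kernel names and buffer handling are assumptions (the kernels are only declared here; see the assignment-step sketch in the Intro for one possible body), and device allocation, dataset loading, and centroid seeding are omitted.

```cuda
#include <cstdlib>

// Illustrative kernel signatures, assumed rather than taken from the repo.
__global__ void assignClusters(const float3 *points, const float3 *centroids,
                               int *assignments, int n, int k);
__global__ void updateCentroids(const float3 *points, const int *assignments,
                                float3 *centroids, int n, int k);

int main(int argc, char **argv)
{
    if (argc < 4) return 1;
    int n     = std::atoi(argv[1]); // N: points to read from the dataset
    int k     = std::atoi(argv[2]); // K: number of clusters
    int iters = std::atoi(argv[3]); // I: fixed number of iterations

    // Device buffers; allocation, dataset loading, and centroid seeding
    // are omitted from this sketch.
    float3 *d_points = nullptr, *d_centroids = nullptr;
    int *d_assignments = nullptr;

    int threads = 256;
    int blocks  = (n + threads - 1) / threads;

    // Fixed number of iterations: no convergence check, as noted in the Intro.
    for (int it = 0; it < iters; ++it) {
        assignClusters<<<blocks, threads>>>(d_points, d_centroids,
                                            d_assignments, n, k);
        updateCentroids<<<blocks, threads>>>(d_points, d_assignments,
                                             d_centroids, n, k);
    }
    cudaDeviceSynchronize(); // wait for the last iteration before reading back
    return 0;
}
```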

Plotting

You can check the output results with:

python plot.py

Performance

These results were obtained on an NVIDIA GeForce GTX 980 Ti:

| Number of points | Sequential (s) | CUDA (s) | Speed-up |
|------------------|----------------|----------|----------|
| 10               | 0.001          | 0.116    | x0.008   |
| 10²              | 0.015          | 0.117    | x0.1     |
| 10³              | 0.180          | 0.119    | x1.5     |
| 10⁴              | 1.673          | 0.147    | x11.4    |
| 10⁵              | 8.424          | 0.579    | x14.5    |
| 10⁶              | 83.024         | 5.706    | x14.5    |
| 10⁷              | 804.611        | 54.319   | x14.8    |

Acknowledgments

Parallel Computing - Computer Engineering Master's Degree @University of Florence.