kmeans-cuda

COMP 633 project: k-means in CUDA

todo

the compute capability of the Titan V is 7.0 (on phaedra)
the compute capability of the GTX 1080 Ti is 6.1 (my desktop)
the CUDA version I'm using is 9.2 (on phaedra)
Maximum x-dimension of a grid of thread blocks: 2^31-1, starting from cc3.0
Maximum number of threads per block: 1024, starting from cc2.0
Maximum x- or y-dimension of a block: 1024, starting from cc2.0
atomicAdd_system() (atomic across all CPUs and GPUs in the system; may not need it), starting from cc6.0
32-bit floating-point version of atomicAdd(), starting from cc2.0
64-bit floating-point version of atomicAdd(), starting from cc6.0
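
A rough sketch of how the limits and atomics noted above might fit together in the k-means update step: one thread per point, with each thread adding its point into the running sum of its cluster via the 32-bit float atomicAdd(). This is not the project's actual code. The names accumulate_clusters and accumulate_on_gpu, the row-major point layout, and the block size of 256 are all assumptions made for illustration.

```cuda
#include <climits>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: one thread per point. Many threads may target the same
// cluster, so the per-cluster sums and counts are updated with atomicAdd().
__global__ void accumulate_clusters(const float *points, const int *labels,
                                    float *sums, int *counts, int n, int dim)
{
    // 64-bit index so a rounded-up grid cannot overflow a 32-bit int.
    long long i = (long long)blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int c = labels[i];  // assumed to lie in [0, k)
    for (int d = 0; d < dim; ++d)
        atomicAdd(&sums[(long long)c * dim + d], points[i * dim + d]);  // float, cc >= 2.0
    atomicAdd(&counts[c], 1);
}

// Hypothetical host wrapper. points is row-major, n x dim; labels has n entries.
// Fills sums (k x dim) and counts (k). Returns false on bad input or CUDA error.
bool accumulate_on_gpu(const std::vector<float> &points,
                       const std::vector<int> &labels, int k, int dim,
                       std::vector<float> &sums, std::vector<int> &counts)
{
    if (k <= 0 || dim <= 0) return false;
    if (labels.size() > (size_t)INT_MAX) return false;
    if (points.size() != labels.size() * (size_t)dim) return false;
    const int n = (int)labels.size();
    sums.assign((size_t)k * dim, 0.0f);
    counts.assign(k, 0);
    if (n == 0) return true;

    const size_t pts_bytes = points.size() * sizeof(float);
    const size_t lab_bytes = labels.size() * sizeof(int);
    const size_t sum_bytes = sums.size() * sizeof(float);
    const size_t cnt_bytes = counts.size() * sizeof(int);

    float *d_points = NULL, *d_sums = NULL;
    int *d_labels = NULL, *d_counts = NULL;
    bool ok =
        cudaMalloc((void **)&d_points, pts_bytes) == cudaSuccess &&
        cudaMalloc((void **)&d_labels, lab_bytes) == cudaSuccess &&
        cudaMalloc((void **)&d_sums, sum_bytes) == cudaSuccess &&
        cudaMalloc((void **)&d_counts, cnt_bytes) == cudaSuccess &&
        cudaMemcpy(d_points, &points[0], pts_bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
        cudaMemcpy(d_labels, &labels[0], lab_bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
        cudaMemset(d_sums, 0, sum_bytes) == cudaSuccess &&
        cudaMemset(d_counts, 0, cnt_bytes) == cudaSuccess;

    if (ok) {
        const int threads = 256;  // must not exceed 1024 threads per block
        // ceil(n / threads): at most 2^23 blocks for an int n, far below 2^31-1.
        const unsigned blocks = (unsigned)(((long long)n + threads - 1) / threads);
        accumulate_clusters<<<blocks, threads>>>(d_points, d_labels, d_sums,
                                                 d_counts, n, dim);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(&sums[0], d_sums, sum_bytes, cudaMemcpyDeviceToHost) == cudaSuccess &&
             cudaMemcpy(&counts[0], d_counts, cnt_bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // cudaFree() on a NULL pointer is a no-op, so this is safe after a failure.
    cudaFree(d_points);
    cudaFree(d_labels);
    cudaFree(d_sums);
    cudaFree(d_counts);
    return ok;
}
```

Dividing each row of sums by the matching entry of counts would then give the new centroids. Float atomicAdd() does not fix the order of the additions, so results can differ in the last bits from run to run. Switching sums to double would need the 64-bit atomicAdd(), which means compiling for cc6.0 or higher (for example nvcc -arch=sm_61 for the GTX 1080 Ti or -arch=sm_70 for the Titan V).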