CUDA Kernels on V100

Primary language: CUDA. License: GNU General Public License v3.0 (GPL-3.0).

A few CUDA kernels written for the V100, mainly used to demonstrate optimization methods.

To keep dependencies minimal, use the Makefile to build all executables.

File structure

// Reduce operation
reduce/

// Scan operation
scan/

// Square matrix transpose
transpose/

// General matrix multiply C = A * B
sgemm/
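
To give a flavor of the kind of kernel these directories cover, here is a minimal sketch of a shared-memory tree reduction. It is not taken from this repository: the names `block_reduce_sum` and `reduce_sum`, the block size of 256, and the choice to sum the per-block partial results on the host are all assumptions made for illustration. Error checking on the CUDA runtime calls is omitted for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel (not from this repository): each block sums up to
// blockDim.x input elements in shared memory and writes one partial sum.
// blockDim.x is assumed to be a power of two.
__global__ void block_reduce_sum(const float* in, float* partial, int n) {
    extern __shared__ float smem[];
    int tid = threadIdx.x;
    int idx = blockIdx.x * blockDim.x + tid;

    // Load one element per thread; out-of-range threads contribute zero.
    smem[tid] = (idx < n) ? in[idx] : 0.0f;
    __syncthreads();

    // Tree reduction: halve the number of active threads at every step.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) {
            smem[tid] += smem[tid + stride];
        }
        __syncthreads();
    }

    // Thread 0 holds the block's sum.
    if (tid == 0) {
        partial[blockIdx.x] = smem[0];
    }
}

// Hypothetical host helper: launches the kernel and adds up the
// per-block partial sums on the CPU.
float reduce_sum(const std::vector<float>& host_in) {
    const int n = static_cast<int>(host_in.size());
    if (n == 0) return 0.0f;

    const int threads = 256;  // assumed block size, power of two
    const int blocks = (n + threads - 1) / threads;

    float* d_in = nullptr;
    float* d_partial = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_partial, blocks * sizeof(float));
    cudaMemcpy(d_in, host_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // Third launch parameter: dynamic shared memory size in bytes.
    block_reduce_sum<<<blocks, threads, threads * sizeof(float)>>>(d_in, d_partial, n);

    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), d_partial, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_partial);

    float sum = 0.0f;
    for (float p : partial) sum += p;
    return sum;
}
```

Compiled with `nvcc`, calling `reduce_sum` on a vector of 1000 ones should return 1000. This version is a deliberately simple baseline; typical next optimization steps include loading several elements per thread and replacing the last few loop iterations with warp-level shuffles.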