/CUDA_Tiled_2D_Convolution

Tiled implementation of 2D matrix convolution in CUDA. Each GPU thread block stages its input tile in shared memory and reads the convolution mask from constant memory, which cuts redundant global-memory traffic, eases the memory-bandwidth bottleneck, and yields a higher speedup.
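The tiling idea can be illustrated with a minimal sketch. It is not the repository's actual code. It assumes a fixed 5x5 mask, a 16x16 output tile, single-channel row-major `float` data, and zero padding at the borders. The names `conv2d_tiled`, `convolve_tiled`, `convolve_reference`, `d_mask`, `MASK_WIDTH`, and `TILE_WIDTH` are hypothetical. Like many course-style implementations, the sketch applies the mask without flipping it (strictly a cross-correlation), and error checking on the CUDA calls is omitted for brevity.

```cuda
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

#define MASK_WIDTH 5                               // assumed odd mask size
#define MASK_RADIUS (MASK_WIDTH / 2)
#define TILE_WIDTH 16                              // assumed output tile edge
#define BLOCK_WIDTH (TILE_WIDTH + MASK_WIDTH - 1)  // input tile edge, halo included

// The mask is read-only and identical for all threads, so it fits constant memory.
__constant__ float d_mask[MASK_WIDTH * MASK_WIDTH];

__global__ void conv2d_tiled(const float* in, float* out, int height, int width)
{
    __shared__ float tile[BLOCK_WIDTH][BLOCK_WIDTH];

    int tx = threadIdx.x, ty = threadIdx.y;
    int row_o = blockIdx.y * TILE_WIDTH + ty;      // output coordinates
    int col_o = blockIdx.x * TILE_WIDTH + tx;
    int row_i = row_o - MASK_RADIUS;               // input coordinates, shifted by the halo
    int col_i = col_o - MASK_RADIUS;

    // Every thread loads one input element; out-of-range cells become zero padding.
    if (row_i >= 0 && row_i < height && col_i >= 0 && col_i < width)
        tile[ty][tx] = in[row_i * width + col_i];
    else
        tile[ty][tx] = 0.0f;
    __syncthreads();

    // Only the first TILE_WIDTH x TILE_WIDTH threads compute an output element.
    if (ty < TILE_WIDTH && tx < TILE_WIDTH && row_o < height && col_o < width) {
        float acc = 0.0f;
        for (int i = 0; i < MASK_WIDTH; ++i)
            for (int j = 0; j < MASK_WIDTH; ++j)
                acc += d_mask[i * MASK_WIDTH + j] * tile[ty + i][tx + j];
        out[row_o * width + col_o] = acc;
    }
}

// Host wrapper: copies data to the device, launches the kernel, copies the result back.
std::vector<float> convolve_tiled(const std::vector<float>& in,
                                  const std::vector<float>& mask,
                                  int height, int width)
{
    size_t bytes = in.size() * sizeof(float);
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, in.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpyToSymbol(d_mask, mask.data(), MASK_WIDTH * MASK_WIDTH * sizeof(float));

    dim3 block(BLOCK_WIDTH, BLOCK_WIDTH);
    dim3 grid((width + TILE_WIDTH - 1) / TILE_WIDTH,
              (height + TILE_WIDTH - 1) / TILE_WIDTH);
    conv2d_tiled<<<grid, block>>>(d_in, d_out, height, width);

    std::vector<float> out(in.size());
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return out;
}

// CPU reference with the same zero-padding convention, for checking the kernel.
std::vector<float> convolve_reference(const std::vector<float>& in,
                                      const std::vector<float>& mask,
                                      int height, int width)
{
    std::vector<float> out(in.size(), 0.0f);
    for (int r = 0; r < height; ++r)
        for (int c = 0; c < width; ++c) {
            float acc = 0.0f;
            for (int i = 0; i < MASK_WIDTH; ++i)
                for (int j = 0; j < MASK_WIDTH; ++j) {
                    int rr = r + i - MASK_RADIUS, cc = c + j - MASK_RADIUS;
                    if (rr >= 0 && rr < height && cc >= 0 && cc < width)
                        acc += mask[i * MASK_WIDTH + j] * in[rr * width + cc];
                }
            out[r * width + c] = acc;
        }
    return out;
}
```

In this layout each block launches `BLOCK_WIDTH * BLOCK_WIDTH` threads so that the halo is loaded in a single pass, and the threads outside the output tile sit idle during the compute phase. Other designs launch only `TILE_WIDTH * TILE_WIDTH` threads and load the halo in extra steps. To try the sketch, call `convolve_tiled` from a host `main` with a `height * width` input vector and a `MASK_WIDTH * MASK_WIDTH` mask vector, then compare the result against `convolve_reference`.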

Primary language: CUDA