Block_Matrix_Multiply_CUDA

Block matrix multiply parallel algorithm using CUDA

Compile:

nvcc -O2 bmm_main.cu bmm.cu -o bmm

Execute:

./bmm M

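The argument `M` is an exponent, not the matrix size itself: the matrix dimension is $N = 2^M$. For example, `./bmm 10` runs with $N = 1024$.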
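
The kernels in `bmm.cu` are not reproduced in this README, so the following is only a minimal sketch of how a block (tiled) multiply in shared memory could look. It assumes square, row-major $N \times N$ `float` matrices with $N = 2^M$ and a tile width of 16. The names `TILE`, `bmm_kernel` and `bmm_gpu` are hypothetical and are not taken from the repository.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical tile width (an assumption, not the repository's value).
constexpr int TILE = 16;

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s\n",                  \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Each thread block computes one TILE x TILE block of C = A * B.
// Matrices are n x n, row-major; n must be a multiple of TILE.
__global__ void bmm_kernel(const float* A, const float* B, float* C, int n) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    const int row = blockIdx.y * TILE + threadIdx.y;
    const int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    // Walk over the tiles of A's block row and B's block column.
    for (int t = 0; t < n / TILE; ++t) {
        As[threadIdx.y][threadIdx.x] = A[row * n + t * TILE + threadIdx.x];
        Bs[threadIdx.y][threadIdx.x] = B[(t * TILE + threadIdx.y) * n + col];
        __syncthreads();  // both tiles fully loaded

        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // finish reading before the next load overwrites
    }
    C[row * n + col] = acc;
}

// Host wrapper: returns C = A * B for n x n matrices with n = 2^m.
std::vector<float> bmm_gpu(const std::vector<float>& A,
                           const std::vector<float>& B, int m) {
    const int n = 1 << m;  // N = 2^M
    const size_t count = static_cast<size_t>(n) * n;
    assert(n % TILE == 0);  // holds for m >= 4 when TILE == 16
    assert(A.size() == count && B.size() == count);

    const size_t bytes = count * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    CUDA_CHECK(cudaMalloc(&dA, bytes));
    CUDA_CHECK(cudaMalloc(&dB, bytes));
    CUDA_CHECK(cudaMalloc(&dC, bytes));
    CUDA_CHECK(cudaMemcpy(dA, A.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(dB, B.data(), bytes, cudaMemcpyHostToDevice));

    const dim3 block(TILE, TILE);
    const dim3 grid(n / TILE, n / TILE);
    bmm_kernel<<<grid, block>>>(dA, dB, dC, n);
    CUDA_CHECK(cudaGetLastError());

    std::vector<float> C(count);
    // This blocking copy also waits for the kernel to finish.
    CUDA_CHECK(cudaMemcpy(C.data(), dC, bytes, cudaMemcpyDeviceToHost));

    CUDA_CHECK(cudaFree(dA));
    CUDA_CHECK(cudaFree(dB));
    CUDA_CHECK(cudaFree(dC));
    return C;
}
```

A `main` that reads `M` from `argv[1]` could fill two vectors of length $N^2$ and call `bmm_gpu(A, B, M)`. Because the sketch does no bounds checking inside the kernel, it only works when $N$ is a multiple of `TILE`, which for a tile width of 16 means $M \ge 4$.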