Demo code that compares the speed of matrix multiplication on the CPU and on the GPU
The repo contains a single source file that can be built and run in an environment with CUDA installed. The code multiplies two 1000 x 1000 matrices, first on the CPU and then on the GPU, and reports the execution time of each run.
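For orientation, the CPU half of such a comparison can be sketched roughly as below. This is a minimal illustration, not the code in this repo: the function names `matmul_cpu` and `time_matmul_cpu_ms`, the row-major `float` layout, and the use of `std::chrono` for timing are all assumptions made for the example.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the repo): multiply two n x n matrices
// stored in row-major order and return the n x n product.
std::vector<float> matmul_cpu(const std::vector<float>& a,
                              const std::vector<float>& b,
                              std::size_t n) {
    std::vector<float> c(n * n, 0.0f);
    // i-k-j loop order walks b and c along rows, which is friendlier
    // to the cache than the textbook i-j-k order.
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            const float aik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    return c;
}

// Hypothetical helper (not from the repo): time one n x n multiplication
// on constant-filled inputs and return the elapsed wall time in ms.
double time_matmul_cpu_ms(std::size_t n) {
    const std::vector<float> a(n * n, 1.0f);
    const std::vector<float> b(n * n, 2.0f);
    const auto start = std::chrono::steady_clock::now();
    const std::vector<float> c = matmul_cpu(a, b, n);
    const auto stop = std::chrono::steady_clock::now();
    (void)c;  // the result is only needed to make the work happen
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_matmul_cpu_ms(1000)` would time a multiplication of the size described above. When timing the GPU side for comparison, note that kernel launches are asynchronous, so the device has to be synchronized before the stop timestamp is taken.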
Matrix size and CUDA block size can be adjusted as needed.
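The two settings interact: the number of blocks launched per grid dimension usually has to be rounded up so that the blocks cover the whole matrix. The sketch below shows that arithmetic, assuming a square two-dimensional thread block; the helper name `blocks_needed` is hypothetical and the repo's actual launch configuration may differ.

```cpp
#include <cstddef>

// Hypothetical helper (not from the repo): number of blocks needed along
// one grid dimension so that blocks of `block_size` threads cover `n`
// elements. This is integer division rounded up.
constexpr std::size_t blocks_needed(std::size_t n, std::size_t block_size) {
    return (n + block_size - 1) / block_size;
}

// With the 1000 x 1000 default and a 16 x 16 block, 63 blocks are needed
// per dimension (62 would cover only 992 rows or columns).
static_assert(blocks_needed(1000, 16) == 63, "round up, not down");
```

Because 63 blocks of 16 threads give 1008 threads per dimension, a kernel launched this way needs a bounds check so that the 8 extra threads at each edge do not read or write outside the matrix. Sizes that are an exact multiple of the block size avoid the leftover threads.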