A simple yet fast implementation of matrix multiplication in CUDA.
Primary language: CUDA. License: MIT.