matrix-multiply
There are 6 repositories under matrix-multiply topic.
Bruce-Lee-LY/cuda_hgemm
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.
Bruce-Lee-LY/cuda_hgemv
Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core.
Bruce-Lee-LY/cutlass_gemm
Multiple GEMM operators are constructed with cutlass to support LLM inference.
Bruce-Lee-LY/matrix_multiply
Several common methods of matrix multiplication are implemented on CPU and Nvidia GPU using C++11 and CUDA.
Bruce-Lee-LY/cuda_back2back_hgemm
Use tensor core to calculate back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instruction.
JoKeRooo7/matrix_c
c lib for calculating matrices