/shgemm

Fast multiplication of single-precision and half-precision matrices on Tensor Cores

Primary Language: CUDA
