CUDA version gives wrong results when m, n and k are not all equal

Question

seth-lu opened this issue 2 years ago · 1 comment

For example, in kernel_v3:
float *begin_a = a + by * BLOCK * k; //by->n
float *begin_b = b + bx * BLOCK; //bx->m

This produces wrong results when A and B are not square matrices, for example m = k = 256, n = 128.
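To illustrate where each matrix's row length enters this kind of tile indexing, here is a minimal host-side sketch in plain C++. It is not the repository's kernel. The layout (A is M x K, B is K x N, C is M x N, all row-major), the tile size of 16, and the helper names `step_a`, `step_b` and `tiled_gemm` are assumptions made for the example, and the repository's own m/n/k naming may differ. With square inputs every stride is the same number, so a mixed-up stride only shows up once the shapes differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int BLOCK = 16;  // tile edge; the value 16 is an assumption

// Assumed layout for this sketch only: A is M x K, B is K x N, C is M x N,
// all row-major. The repository's own m/n/k naming may differ.

// Offset of the first A tile for block row `by`: skip by * BLOCK rows of length K.
inline std::size_t begin_a(int by, int K) {
    return static_cast<std::size_t>(by) * BLOCK * K;
}

// Offset of the first B tile for block column `bx`: skip bx * BLOCK columns.
inline std::size_t begin_b(int bx) {
    return static_cast<std::size_t>(bx) * BLOCK;
}

// Advancing one tile along K moves BLOCK columns in A but BLOCK rows in B,
// so the B step uses B's row length N (not K, and not M).
inline std::size_t step_a() { return BLOCK; }
inline std::size_t step_b(int N) { return static_cast<std::size_t>(BLOCK) * N; }

// Host-side emulation of the tiled loop; (by, bx) play the role of blockIdx
// and (ty, tx) the role of threadIdx.
std::vector<float> tiled_gemm(const std::vector<float>& a,
                              const std::vector<float>& b,
                              int M, int N, int K) {
    assert(M % BLOCK == 0 && N % BLOCK == 0 && K % BLOCK == 0);
    std::vector<float> c(static_cast<std::size_t>(M) * N, 0.0f);
    for (int by = 0; by < M / BLOCK; ++by) {
        for (int bx = 0; bx < N / BLOCK; ++bx) {
            std::size_t oa = begin_a(by, K);
            std::size_t ob = begin_b(bx);
            for (int t = 0; t < K / BLOCK; ++t) {
                for (int ty = 0; ty < BLOCK; ++ty) {
                    for (int tx = 0; tx < BLOCK; ++tx) {
                        float sum = 0.0f;
                        for (int i = 0; i < BLOCK; ++i) {
                            sum += a[oa + static_cast<std::size_t>(ty) * K + i] *
                                   b[ob + static_cast<std::size_t>(i) * N + tx];
                        }
                        std::size_t row = static_cast<std::size_t>(by) * BLOCK + ty;
                        std::size_t col = static_cast<std::size_t>(bx) * BLOCK + tx;
                        c[row * N + col] += sum;
                    }
                }
                oa += step_a();  // next tile of A: BLOCK columns to the right
                ob += step_b(N); // next tile of B: BLOCK rows down
            }
        }
    }
    return c;
}
```

Calling `tiled_gemm(a, b, 256, 128, 256)` exercises a non-square case under this sketch's naming. Swapping `N` for `K` in `step_b` would still pass for square inputs, which is why square-only tests can miss this class of bug.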

Answer 1 · 2022-11-22T08:18:32.000Z

Fixed in version 3. I had gotten lost in the ugly leading dimensions.

Still, I need to add more CI and unit tests.
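As a rough idea of what such a unit test could look like, the sketch below sweeps a few non-square shapes and compares a GEMM implementation against a naive reference. Everything here is hypothetical: the `GemmFn` signature, the helper names, the shape list and the tolerance are assumptions for illustration, and a real test would wrap the CUDA kernel launch (with its host/device copies) in a function of this form. The layout assumed is row-major with A as M x K, B as K x N and C as M x N.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Matrix = std::vector<float>;
// Hypothetical signature for the GEMM under test: arguments are (A, B, M, N, K).
using GemmFn = std::function<Matrix(const Matrix&, const Matrix&, int, int, int)>;

// Naive triple loop used as ground truth.
Matrix reference_gemm(const Matrix& a, const Matrix& b, int M, int N, int K) {
    Matrix c(static_cast<std::size_t>(M) * N, 0.0f);
    for (int r = 0; r < M; ++r)
        for (int col = 0; col < N; ++col) {
            float sum = 0.0f;
            for (int i = 0; i < K; ++i)
                sum += a[static_cast<std::size_t>(r) * K + i] *
                       b[static_cast<std::size_t>(i) * N + col];
            c[static_cast<std::size_t>(r) * N + col] = sum;
        }
    return c;
}

// Runs `gemm` on deterministic inputs and compares against the reference.
bool matches_reference(const GemmFn& gemm, int M, int N, int K, float tol = 1e-3f) {
    Matrix a(static_cast<std::size_t>(M) * K), b(static_cast<std::size_t>(K) * N);
    for (std::size_t i = 0; i < a.size(); ++i) a[i] = static_cast<float>(i % 7);
    for (std::size_t i = 0; i < b.size(); ++i) b[i] = static_cast<float>(i % 5 + 1);
    Matrix expected = reference_gemm(a, b, M, N, K);
    Matrix actual = gemm(a, b, M, N, K);
    if (actual.size() != expected.size()) return false;
    for (std::size_t i = 0; i < expected.size(); ++i)
        if (std::fabs(actual[i] - expected[i]) > tol) return false;
    return true;
}

struct Shape { int M, N, K; };

// Returns how many shapes fail; the list mixes a square case with non-square ones.
int count_failures(const GemmFn& gemm) {
    const Shape shapes[] = {{128, 128, 128}, {256, 128, 256}, {128, 256, 64}, {64, 32, 256}};
    int failures = 0;
    for (const Shape& s : shapes)
        if (!matches_reference(gemm, s.M, s.N, s.K)) ++failures;
    return failures;
}
```

A CI job could then fail the build whenever `count_failures(gemm_under_test)` is nonzero. Keeping at least one shape where all three dimensions differ makes it harder for a swapped stride to go unnoticed.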