about ldg32_nc_0
YijiaZhao opened this issue · 3 comments
YijiaZhao commented
https://github.com/tpoisonooo/how-to-optimize-gemm/blob/master/cuda/MMult_cuda_12.cu, lines 20 and 21
I'm a beginner with CUDA and PTX. What are these two PTX lines used for?
"{.reg .pred p;\n"
"mov.b32 %0, 0;\n"
Are they useless code?
tpoisonooo commented
For .reg .pred p;
Yes, it is useless here. It declares a predicate register p, which was originally used as a predicate guard to make an instruction execute conditionally.
mov.b32 %0, 0
zeroes the register bound to operand %0. If you do not like it, just remove it.
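To illustrate what a predicate guard looks like when it is actually used, here is a minimal sketch. It is not the code from MMult_cuda_12.cu: the names ldg32_nc_guarded, demo_kernel, and run_guard_demo, and the extra guard parameter, are hypothetical and exist only for this example. The register is zeroed first, then the load runs only when the predicate p is true, so lanes with guard == false keep the value 0.

```cuda
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical guarded load (illustration only, not the repository's function).
// Returns the 32-bit word at ptr when guard is true, otherwise 0.
__device__ __forceinline__ uint32_t ldg32_nc_guarded(const void *ptr, bool guard) {
  uint32_t reg;
  asm volatile(
      "{\n"
      "  .reg .pred p;\n"                  // declare predicate register p
      "  setp.ne.b32 p, %2, 0;\n"          // p = (guard != 0)
      "  mov.b32 %0, 0;\n"                 // default value when the load is skipped
      "  @p ld.global.nc.u32 %0, [%1];\n"  // load executes only if p is true
      "}\n"
      : "=r"(reg)
      : "l"(ptr), "r"((int)guard));
  return reg;
}

// Each thread copies one element; only the first n_valid elements are guarded in.
__global__ void demo_kernel(const uint32_t *in, uint32_t *out, int n_valid, int n_total) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n_total) out[i] = ldg32_nc_guarded(in + i, i < n_valid);
}

// Host helper: true if guarded-in lanes read their data and the rest read 0.
bool run_guard_demo() {
  const int n_total = 8, n_valid = 5;
  uint32_t h_in[n_total], h_out[n_total];
  for (int i = 0; i < n_total; ++i) h_in[i] = 100u + i;  // nonzero everywhere

  uint32_t *d_in = nullptr, *d_out = nullptr;
  if (cudaMalloc(&d_in, sizeof(h_in)) != cudaSuccess) return false;
  if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) { cudaFree(d_in); return false; }
  cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

  demo_kernel<<<1, 32>>>(d_in, d_out, n_valid, n_total);
  cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
  cudaFree(d_in);
  cudaFree(d_out);
  if (err != cudaSuccess) return false;

  for (int i = 0; i < n_total; ++i) {
    uint32_t expected = (i < n_valid) ? h_in[i] : 0u;
    if (h_out[i] != expected) return false;
  }
  return true;
}
```

To try it, call run_guard_demo() from a main function and compile with nvcc. If the setp line and the @p prefix are dropped, the load always runs; the p declaration is then unused and the mov is immediately overwritten by the load, which is why both lines can be removed in that case.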
YijiaZhao commented
Thank you.