NVIDIA/FasterTransformer

CUDA 11.7 and CUDA 11.8 give different results for decoder self-attention?

frankxyy opened this issue · 0 comments

I found that decoder self-attention produces different results under CUDA toolkit versions 11.7 and 11.8. CUDA 11.8 gives the expected result. Why does this happen?