Mismatch of dweight in layernorm_backward.cu
foreverpiano opened this issue · 0 comments
I compiled and ran layernorm_backward.cu (kernel 2) with the following commands:
nvcc -O3 --use_fast_math -lcublas -lcublasLt layernorm_backward.cu -o layernorm_backward
./layernorm_backward 2
Output (each value pair is CPU_ref followed by GPU):
Using kernel 2
Checking correctness...
dinp:
-1.182338 -1.187500
0.236102 0.236328
0.667884 0.667969
-1.111703 -1.117188
0.685677 0.687500
dweight:
8.187102 4.125000
Mismatch of dweight at 0: CPU_ref: 8.187102 vs GPU: 4.125000
90.587265 104.000000
Mismatch of dweight at 1: CPU_ref: 90.587265 vs GPU: 104.000000
-38.522667 -39.750000
114.101608 108.500000
58.895130 64.500000
Mismatch of dweight at 4: CPU_ref: 58.895130 vs GPU: 64.500000
Mismatch of dweight at 7: CPU_ref: -5.614801 vs GPU: -7.562500
Mismatch of dweight at 9: CPU_ref: 67.834213 vs GPU: 91.000000
Mismatch of dweight at 10: CPU_ref: -1.165685 vs GPU: 4.500000
Mismatch of dweight at 11: CPU_ref: 131.065735 vs GPU: 98.500000
Mismatch of dweight at 17: CPU_ref: -19.924236 vs GPU: -15.812500
Mismatch of dweight at 19: CPU_ref: -31.724333 vs GPU: -28.625000
Mismatch of dweight at 21: CPU_ref: 24.697800 vs GPU: 29.000000
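To make the size of the gap concrete, here is a minimal sketch of a tolerance check applied to the first five dweight pairs from the log above. The tolerances `ATOL` and `RTOL` and the helper `find_mismatches` are hypothetical, chosen only for illustration; they are not the values or the function used by the actual test harness. With these assumed tolerances the same three entries (0, 1 and 4) are flagged, so the failures are well outside what a small rounding difference would explain.

```python
# CPU_ref / GPU pairs copied from the first five dweight entries in the log.
pairs = [
    (8.187102, 4.125000),
    (90.587265, 104.000000),
    (-38.522667, -39.750000),
    (114.101608, 108.500000),
    (58.895130, 64.500000),
]

# Hypothetical tolerances (assumptions, not taken from layernorm_backward.cu).
ATOL = 0.5   # absolute slack
RTOL = 0.05  # relative slack, scaled by the magnitude of the CPU reference


def find_mismatches(pairs, atol=ATOL, rtol=RTOL):
    """Return indices where |ref - gpu| exceeds atol + rtol * |ref|."""
    bad = []
    for i, (ref, gpu) in enumerate(pairs):
        if abs(ref - gpu) > atol + rtol * abs(ref):
            bad.append(i)
    return bad


mismatches = find_mismatches(pairs)
for i in mismatches:
    ref, gpu = pairs[i]
    print(f"Mismatch of dweight at {i}: CPU_ref: {ref:.6f} vs GPU: {gpu:.6f}")
```

Entries 2 and 3 pass under these assumed tolerances even though their absolute errors are above 1.0, because the relative term scales with the reference value.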