
Possible missing self.h factor in the FLOPs calculation of attn_output_layer_norm

hrheru2021 opened this issue · 0 comments

In the file, the FLOPs calculation for attn_output_layer_norm (Line 77) does not include the self.h multiplication factor, whereas the calculation for output_layer_norm (Line 85) does. The logic of these two lines appears to be otherwise identical, so it seems the self.h factor is missing from the FLOPs calculation of attn_output_layer_norm.
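
To make the size of the discrepancy concrete, here is a minimal sketch of the two calculations as I understand them. It is not the code from the file: the constant `LAYER_NORM_FLOPS_PER_ELEMENT`, its value of 5, the helper `layer_norm_flops`, and the choice `h = 768` are all assumptions for illustration, and I am assuming self.h is the hidden size.

```python
# Hypothetical per-element cost of layer normalization. The name and the
# value 5 are assumptions; the constant in the file may differ.
LAYER_NORM_FLOPS_PER_ELEMENT = 5


def layer_norm_flops(h, scale_by_hidden=True):
    """Return the FLOPs counted for one layer norm.

    With scale_by_hidden=True the per-element cost is multiplied by h
    (the form described for Line 85). With False, the factor is omitted
    (the form described for Line 77).
    """
    return LAYER_NORM_FLOPS_PER_ELEMENT * (h if scale_by_hidden else 1)


h = 768  # assumed hidden size, stands in for self.h

# Form described for Line 77: no self.h factor.
attn_output_layer_norm = layer_norm_flops(h, scale_by_hidden=False)

# Form described for Line 85: includes the self.h factor.
output_layer_norm = layer_norm_flops(h)

# How many times smaller the Line 77 count is.
ratio = output_layer_norm // attn_output_layer_norm

print(attn_output_layer_norm, output_layer_norm, ratio)
```

Under these assumptions the attention-side count is smaller by a factor of exactly h, so if the two layer norms really do the same work, the attention-side count is understated by that factor.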