
About Normalization

This is a question. I managed to get some improvement based on this method. I wonder whether BatchNorm has any effect on the result, since attention computes the relationships between images using a dot product, while BatchNorm changes the mean and standard deviation of the features. Does anybody else have the same question?
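
To make the concern concrete, here is a minimal NumPy sketch, not the code from this repository. It assumes BatchNorm can be approximated by per-feature standardization over the batch (training mode, no learned scale or shift). The names `dot_attention` and `batch_standardize` and the toy `feats` matrix are hypothetical, made up only for this illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def dot_attention(feats):
    # Hypothetical helper: softmax over pairwise dot products between rows.
    scores = feats @ feats.T
    return softmax(scores, axis=-1)


def batch_standardize(feats, eps=1e-5):
    # Assumption: BatchNorm in training mode without affine parameters,
    # i.e. each feature is standardized with its batch mean and variance.
    mean = feats.mean(axis=0, keepdims=True)
    var = feats.var(axis=0, keepdims=True)
    return (feats - mean) / np.sqrt(var + eps)


# Toy batch of 3 samples with 2 features (made-up values).
# The second feature is a constant offset shared by every sample.
feats = np.array([[1.0, 5.0],
                  [2.0, 5.0],
                  [3.0, 5.0]])

attn_raw = dot_attention(feats)
attn_bn = dot_attention(batch_standardize(feats))
max_diff = np.abs(attn_raw - attn_bn).max()

print("attention on raw features:\n", attn_raw.round(3))
print("attention after standardization:\n", attn_bn.round(3))
print("largest change in any weight:", round(float(max_diff), 3))
```

In this toy case the shared offset dominates the raw dot products, so the first sample attends most to the third one. After standardization the offset is removed and the first sample attends most to itself, so at least in this sketch the normalization does change the attention weights. Whether that helps or hurts in the real model is exactly what I am unsure about.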