Question regarding layer equalization
Closed this issue · 2 comments
suwaxu commented
I noticed that you adjust the weight and bias of the batchnorm layer by multiplying them by s:
Line 65 in 6f15805
Since layer equalization happens after batch-norm folding, the weight and bias of the conv layer should already have been updated with the folded batchnorm parameters. I am wondering why it is still necessary to update the batchnorm parameters (multiplying them by s) here. Thanks a lot!
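For context, the batch-norm folding the question refers to can be sketched roughly as follows. This is an illustrative NumPy sketch with hypothetical variable names, not the repository's actual code:

```python
import numpy as np

# Hypothetical sketch of folding a BatchNorm layer into the preceding
# conv/linear layer. All names (W, b, gamma, beta, ...) are illustrative.

rng = np.random.default_rng(1)

W = rng.normal(size=(4, 3))   # conv/linear weight (out_ch x in_ch)
b = rng.normal(size=4)        # conv/linear bias

gamma = np.abs(rng.normal(size=4)) + 0.1  # BN scale
beta = rng.normal(size=4)                 # BN shift
mean = rng.normal(size=4)                 # BN running mean
var = np.abs(rng.normal(size=4)) + 0.1    # BN running variance
eps = 1e-5

# Fold BN into the conv: W_fold = W * gamma / sqrt(var + eps),
# b_fold = (b - mean) * gamma / sqrt(var + eps) + beta.
scale = gamma / np.sqrt(var + eps)
W_fold = W * scale[:, None]
b_fold = (b - mean) * scale + beta

x = rng.normal(size=3)
y_bn = ((W @ x + b) - mean) * scale + beta   # conv followed by BN
y_fold = W_fold @ x + b_fold                 # single folded layer
assert np.allclose(y_bn, y_fold)             # identical outputs
```

After folding, the conv weight and bias alone reproduce the conv+BN computation, which is why the question asks what role the BN parameters still play during equalization.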
jakc4103 commented
The bn_weight here is not actually involved in the computation graph. It is just a vector that keeps track of the output feature mean (fake_bias) and std (fake_weight), which are used to compute the value range for feature quantization.
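The point above can be illustrated with a small sketch: when equalization rescales a layer's output channels by s, the tracked "fake" statistics must be rescaled by the same factor so the feature quantization range stays correct. This is a hypothetical NumPy sketch using the names from the discussion (s, fake_weight, fake_bias), not the repository's exact implementation:

```python
import numpy as np

# Hypothetical sketch of cross-layer equalization between two linear
# (1x1-conv-like) layers, with "fake" BN statistics tracked alongside.
# Variable names are illustrative only.

rng = np.random.default_rng(0)

# Layer 1: out = W1 @ x + b1 ; Layer 2 consumes that output.
W1 = rng.normal(size=(4, 3))
b1 = rng.normal(size=4)
W2 = rng.normal(size=(2, 4))

# Per-channel stats of layer 1's output, kept only for computing the
# feature quantization range -- not part of the computation graph.
fake_weight = np.abs(rng.normal(size=4))  # tracked per-channel std
fake_bias = rng.normal(size=4)            # tracked per-channel mean

# Equalization scale s (one value per output channel of layer 1).
s = np.abs(rng.normal(size=4)) + 0.5

# Rescale the real parameters: layer 1's output is multiplied by s and
# layer 2 absorbs 1/s on its inputs, so the network output is unchanged.
W1_eq = W1 * s[:, None]
b1_eq = b1 * s
W2_eq = W2 / s[None, :]

# The tracked statistics must be multiplied by s as well, so the
# quantization range still matches the rescaled intermediate feature.
fake_weight_eq = fake_weight * s
fake_bias_eq = fake_bias * s

# Check: the end-to-end function is unchanged by equalization.
x = rng.normal(size=3)
y_orig = W2 @ (W1 @ x + b1)
y_eq = W2_eq @ (W1_eq @ x + b1_eq)
assert np.allclose(y_orig, y_eq)
```

If the tracked statistics were left unscaled, the quantization range would be computed for the pre-equalization feature distribution, which is why they are updated even though the folded conv parameters already carry the BN information.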
suwaxu commented
Thanks so much for your prompt reply!