Why are the batchnorm mean and var not updated during training?

Question

Why are the batchnorm mean and var not updated during training?

RichardChangCA opened this issue 3 years ago · 1 comment

Hello, thanks for your source code.

Could I ask why the batchnorm mean and var are not updated during training?
As I understand it, affine=False means the batch normalization parameters are not updated.

Thanks

Answer 1 · 2022-08-12T11:28:14.000Z

affine=False means that the normalization step of batch norm is not followed by a learnable linear scaling and offset of the form a * x + b. The affine argument only controls these learnable affine parameters. It does not affect the mean and standard deviation used for normalization, which are still computed from each batch and tracked as running estimates during training.
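A quick way to check this yourself is the sketch below. It assumes PyTorch's `nn.BatchNorm1d` with its default `track_running_stats=True`; the feature size and the input batch are arbitrary values picked for illustration and are not taken from the repository.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Batch norm without the learnable scale/offset (a * x + b).
bn = nn.BatchNorm1d(4, affine=False)

# With affine=False there is nothing for the optimizer to update.
num_params = len(list(bn.parameters()))
has_weight = bn.weight is not None
has_bias = bn.bias is not None

# The running statistics are buffers, not parameters.
mean_before = bn.running_mean.clone()

bn.train()
x = torch.randn(32, 4) * 3.0 + 5.0  # illustrative batch, mean around 5
_ = bn(x)  # a forward pass in training mode updates the running statistics

mean_after = bn.running_mean.clone()
stats_updated = not torch.equal(mean_before, mean_after)

print("learnable parameters:", num_params)
print("has weight / bias:", has_weight, has_bias)
print("running_mean changed:", stats_updated)
```

So the layer has no learnable parameters, yet its running mean still moves after a training-mode forward pass.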

For Deep SVDD, it is crucial to disable these affine transformations, as stated in section 3.3 of the paper:

Put differently, Proposition 2 implies that networks with bias terms can easily learn any constant function, which is independent of the input x ∈ X. It follows that bias terms should not be used in neural networks with Deep SVDD since the network can learn the constant function mapping directly to the hypersphere center, leading to hypersphere collapse.

Intuitively, if your network contains a bias term, the last layer could just learn to set all weights to zero and the bias to the center c, mapping everything to the center without even taking the input data into account.
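The collapse can be reproduced by hand with a toy layer. This is only an illustrative sketch: the center `c`, the layer sizes, and the random inputs are made-up values, and the weights are set manually here rather than learned.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical hypersphere center (illustrative values only).
c = torch.tensor([0.5, -1.0])

# A final layer WITH a bias term, set by hand to the degenerate solution:
# all weights zero, bias equal to the center c.
last = nn.Linear(3, 2, bias=True)
with torch.no_grad():
    last.weight.zero_()
    last.bias.copy_(c)

x = torch.randn(8, 3)  # arbitrary inputs
out = last(x)

# Squared distance of every output to the center: the quantity that
# the Deep SVDD objective drives toward zero.
dist = ((out - c) ** 2).sum(dim=1)
collapsed = torch.allclose(dist, torch.zeros_like(dist))

print("all inputs mapped to c:", collapsed)
```

Every input lands exactly on c, so the distance term is zero although the layer ignores x entirely. The same reasoning applies to the offset b of an affine batch norm layer, which acts as a bias term.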