BatchNorm destabilizes training
dichencd opened this issue · 0 comments
dichencd commented
Hi @stepankonev, really appreciate your great work!
I tried to play with your code, but when I added batch normalization layers, training became unstable (some values grew very large). Have you encountered a similar issue? If so, how did you resolve it?
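For reference, by batch normalization I mean the standard per-batch standardization applied at training time, sketched here in plain NumPy (illustrative only, not your code or the exact layers I inserted):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Training-time batch normalization over the batch dimension (axis 0)."""
    mean = x.mean(axis=0)          # per-feature mean over the batch
    var = x.var(axis=0)            # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta    # learnable scale/shift in a real layer

rng = np.random.default_rng(0)
x = rng.normal(5.0, 3.0, size=(32, 8))  # a batch of 32 samples, 8 features
y = batch_norm(x)
print(np.allclose(y.mean(axis=0), 0.0, atol=1e-6))
print(np.allclose(y.var(axis=0), 1.0, atol=1e-3))
```

Since the layer normalizes with the current batch's own statistics, I suspect noisy statistics from small batches (or interaction with the existing data normalization) might be part of what I'm seeing, but I'm not sure.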
I also wonder how you chose the values for the data normalization. Was there a particular reason you decided to normalize the data the way you did?
Thank you very much!