BatchNorm destabilizes training
dichencd opened this issue · 0 comments
dichencd commented
Hi @stepankonev, really appreciate your great work!
I tried to play with your code, but when I added batch normalization layers, training became unstable (some values grew very large). Have you encountered a similar issue? If so, how did you resolve it?
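For reference, by batch normalization I mean the standard per-batch standardization applied at training time, sketched here in plain NumPy (illustrative only, not your code or the exact layers I inserted):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Training-time batch normalization over the batch dimension (axis 0)."""
    mean = x.mean(axis=0)          # per-feature mean over the batch
    var = x.var(axis=0)            # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta    # learnable scale/shift in a real layer

rng = np.random.default_rng(0)
x = rng.normal(5.0, 3.0, size=(32, 8))  # a batch of 32 samples, 8 features
y = batch_norm(x)
print(np.allclose(y.mean(axis=0), 0.0, atol=1e-6))
print(np.allclose(y.var(axis=0), 1.0, atol=1e-3))
```

Since the layer normalizes with the current batch's own statistics, I suspect noisy statistics from small batches (or interaction with the existing data normalization) might be part of what I'm seeing, but I'm not sure.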
I also wonder how you chose the values for the data normalization. Was there a particular reason you decided to normalize the data the way you did?
Thank you very much!