How to solve the problem of Loss NaN
lumosity4tpj opened this issue · 2 comments
lumosity4tpj commented
senya-ashukha commented
Hey,
Thank you for checking out at the assignment code!
- lrt_std should not depend on the bias
- make sure you are using adam or something adaptive
You can also look at tf implementation here https://github.com/google-research/google-research/blob/master/state_of_sparsity/layers/variational_dropout/nn.py#L347.
lumosity4tpj commented
Thanks a lot!!! Because of the problem, I tried a lot. Removing bias seems to work.