How to solve the problem of Loss NaN

Question

lumosity4tpj opened this issue 4 years ago · 2 comments

I am trying to implement Sparse Variational Dropout (Sparse VD) for a convolutional layer, but the loss becomes NaN. Could anyone share code for the convolutional case? I am not very familiar with Theano, which the paper's original code uses, and I still have problems after rewriting it.
This is my code:

Answer 1 · 2020-06-29T02:05:21.000Z

Hey,

Thank you for checking out the assignment code!

lrt_std should not depend on the bias.
Make sure you are using Adam or another adaptive optimizer.
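
To illustrate the first point, here is a minimal sketch of the local reparameterization trick for a 1D convolution in plain Python. It is not the assignment code or the paper's code: the names sparse_vd_conv1d, cross_correlate_valid, theta, and log_sigma2 are made up for this example, and the 1D "valid" setting is an assumption chosen to keep it short. The idea is that the bias is added only to the mean, while the variance is built from the squared inputs and the weight variances alone. One plausible way to get NaN is reusing a convolution-with-bias call for the variance, since a negative bias can push the value under the square root below zero.

```python
import math
import random


def cross_correlate_valid(x, w):
    """1D 'valid' cross-correlation, which is what deep learning calls a convolution."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(len(x) - k + 1)]


def sparse_vd_conv1d(x, theta, log_sigma2, bias, rng, eps=1e-8):
    """Hypothetical Sparse VD 1D conv forward pass using local reparameterization.

    theta      : weight means
    log_sigma2 : log-variances of the weights
    bias       : scalar bias, added to the mean only
    """
    # Mean of the pre-activation: ordinary convolution plus bias.
    lrt_mean = [m + bias for m in cross_correlate_valid(x, theta)]

    # Variance of the pre-activation: conv(x^2, sigma^2), with NO bias term.
    sigma2 = [math.exp(s) for s in log_sigma2]
    x_squared = [v * v for v in x]
    lrt_var = cross_correlate_valid(x_squared, sigma2)

    # eps keeps the square root away from exactly zero.
    lrt_std = [math.sqrt(v + eps) for v in lrt_var]

    # Sample the output directly instead of sampling the weights.
    out = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(lrt_mean, lrt_std)]
    return out, lrt_mean, lrt_std


# Example call with a negative bias: lrt_std is unaffected by it.
x = [0.5, -1.0, 2.0, 0.0, 1.5]
theta = [0.1, -0.2, 0.3]
log_sigma2 = [-2.0, -1.0, -3.0]
out, lrt_mean, lrt_std = sparse_vd_conv1d(
    x, theta, log_sigma2, bias=-5.0, rng=random.Random(0)
)
```

In a framework implementation the same structure would presumably mean calling the convolution with the bias for the mean and without it for the variance.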

You can also look at the TensorFlow implementation here: https://github.com/google-research/google-research/blob/master/state_of_sparsity/layers/variational_dropout/nn.py#L347

Answer 2 · 2020-06-29T02:20:24.000Z

Thanks a lot! I had tried many things because of this problem. Removing the bias seems to work.