Loss becomes nan after ~200 steps.

Question

Thien223 opened this issue 5 years ago · 2 comments

Thien223 commented 5 years ago

Thank you for your work.

I have an issue: after about 200 training steps, the loss becomes NaN, as shown below:

loss=nan, log_p=nan, logdet=nan]922]t=5.16016]

Is this normal? Do you have any experience fixing this issue? Thank you so much.

Answer 1 · 2019-08-07T06:57:42.000Z

Hi. Thanks for the question.
I would recommend training the model from the dev branch with the dtype set to float32 and scale = 1. This problem occurs in the ActNorm layer and appears only in the early steps of training. Just try starting training again.
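
For readers unfamiliar with ActNorm, here is a minimal pure-Python sketch of its data-dependent initialization for a single channel. It is not the code from this repository: the names `actnorm_init`, `actnorm_forward`, `is_finite_loss`, and the `eps` parameter are hypothetical and chosen for illustration. One plausible mechanism for the NaN (an assumption, not something confirmed in this thread) is that a channel with a very small standard deviation in the first batch gets a very large scale, which can overflow in reduced precision (float16 tops out at 65504) and then turn `log_p` and `logdet` into NaN.

```python
import math


def actnorm_init(batch, eps=1e-6):
    """Data-dependent init for one channel (illustrative sketch).

    `batch` holds the first-batch activations of a single channel.
    Returns (bias, log_scale) such that (x + bias) * exp(log_scale)
    has roughly zero mean and unit variance on that batch.
    """
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    std = math.sqrt(var)
    # eps (hypothetical default) keeps log() finite when the channel
    # is constant or nearly constant in the first batch.
    log_scale = -math.log(std + eps)
    return -mean, log_scale


def actnorm_forward(batch, bias, log_scale):
    """Apply the affine transform and return its log-determinant term."""
    scale = math.exp(log_scale)
    out = [(x + bias) * scale for x in batch]
    # Each element contributes log|scale| to the log-determinant.
    logdet = log_scale * len(batch)
    return out, logdet


def is_finite_loss(*terms):
    """True only if every loss term is a finite number (no NaN or inf)."""
    return all(math.isfinite(t) for t in terms)
```

A guard like `is_finite_loss(loss, log_p, logdet)` in the training loop makes it possible to stop as soon as a term goes non-finite and restart, instead of continuing to train on NaN values.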

Answer 2 · 2019-08-07T07:17:22.000Z

Thanks for your quick reply.

I'm not sure what the dev branch is, but I have changed the dtype to float32 in hparams and set the scale to 1.

It seems to be OK now.

Thank you.