DirtyHarryLYL/HAKE-Action-Torch

Loss becomes NaN during AE pretrain

neolifer opened this issue · 1 comment

Hi,
The loss during AE pre-train becomes NaN after the first 4000 iterations. Here's the log I got:

000 epoch, 00000 iter, average time 7.0951, loss 37.2266
000 epoch, 02000 iter, average time 0.6359, loss -403.4386
000 epoch, 04000 iter, average time 0.6286, loss -801.8414
000 epoch, 06000 iter, average time 0.6607, loss nan
2021-03-01 00:56:42,407 - main - INFO - 000 epoch training, L_rec=nan, L_cls=nan, L_ae=0.0000, loss=nan

Is this OK, or did something go wrong?

The reason might be gradient explosion caused by the semihard loss. After adding gradient clipping, the loss returns to normal, but I'm not sure how this would affect the final result.
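
For reference, here is a minimal sketch of the kind of clipping I added, assuming a standard PyTorch training loop; `model`, `optimizer`, `compute_loss`, and the `max_grad_norm` value are placeholders, not names from this repo:

```python
import torch

# Assumed clipping threshold; needs tuning for this setup.
max_grad_norm = 1.0

def train_step(model, optimizer, batch):
    optimizer.zero_grad()
    loss = compute_loss(model, batch)   # placeholder for the loss incl. the semihard term
    loss.backward()
    # Clip the global gradient norm before the optimizer step so that large
    # gradients from the semihard loss cannot blow the weights up to NaN.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    return loss.item()
```

Norm-based clipping (`clip_grad_norm_`) rescales the whole gradient vector rather than truncating individual elements, so it should distort the training dynamics less than `clip_grad_value_`, but the threshold still affects convergence and may change the final numbers slightly.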