Loss nan

Question

Opened this issue 9 months ago · 3 comments

Hello, I am currently training your model on my own data. However, during training the loss becomes NaN in both the STN and the CAE. I have tried several different learning rates, but the problem persists. Do you know what might cause this? Thank you very much!

epoch: 1534
loss: nan
lr : [5e-05]
epoch: 1535
loss: nan
lr : [5e-05]
epoch: 1536
loss: nan
lr : [5e-05]
epoch: 1537
loss: nan

Answer 1 · 2024-01-26T09:37:22.000Z

Hi,
I highly recommend saving the STN alignment result every ~100 epochs to make sure that the STN training has not moved all of the images outside of the alignment frame. The loss and training scheme were designed to avoid this scenario, but it can still happen in challenging scenes.
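The check described above can be sketched roughly as follows. This is a minimal, framework-free sketch and not the repository's code: `should_save_alignment`, `first_non_finite`, and the sample loss history are hypothetical names and values chosen for illustration. The idea is to write an alignment snapshot on a fixed schedule and to find the first epoch where the loss stops being finite, so the last saved snapshot before that point can be inspected.

```python
import math


def should_save_alignment(epoch, every=100):
    """Return True on epochs where an alignment snapshot should be written."""
    return epoch > 0 and epoch % every == 0


def first_non_finite(losses):
    """Return the index of the first NaN/inf loss, or None if all are finite."""
    for i, value in enumerate(losses):
        if not math.isfinite(value):
            return i
    return None


# Hypothetical loss history that diverges at epoch 3 (illustrative values only).
history = [0.91, 0.52, 0.47, float("nan"), float("nan")]
bad_epoch = first_non_finite(history)
if bad_epoch is not None:
    print(f"loss became non-finite at epoch {bad_epoch}; inspect the last saved alignment")
```

In a real training loop, the per-epoch loss would be appended to `history`, and the snapshot would be written whenever `should_save_alignment(epoch)` is true. Stopping at the first non-finite value avoids spending more epochs on a run that has already diverged.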

Answer 2 · 2024-01-26T18:22:05.000Z

Thanks for your answer. My video is a continuous shot from a moving camera, and the camera movement is relatively large and fast. Could this problem be caused by the background changing too much?

Answer 3 · 2024-01-27T09:02:52.000Z

In the paper you can find an example of a sequence with large camera movement, "continuousPan" from the CDnet 2014 dataset. I would advise you to enable a large alignment frame if you know in advance that you are expecting a large camera movement. The rate of change should not have any effect on the results, as the model treats the sequence as an unordered set of images.
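To get a rough feel for how large the alignment frame needs to be, one option is to compute the bounding box that covers every warped image. The sketch below is only an illustration under stated assumptions: it assumes each image's warp is a 2x3 affine matrix in pixel coordinates, which may not match how the model parameterizes its transformations, and `warp_corners`, `required_frame`, and the sample pan are hypothetical.

```python
def warp_corners(theta, width, height):
    """Apply a 2x3 affine matrix (pixel coordinates) to the four image corners."""
    (a, b, tx), (c, d, ty) = theta
    corners = [(0, 0), (width, 0), (0, height), (width, height)]
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in corners]


def required_frame(thetas, width, height):
    """Bounding box (min_x, min_y, max_x, max_y) covering every warped image."""
    xs, ys = [], []
    for theta in thetas:
        for x, y in warp_corners(theta, width, height):
            xs.append(x)
            ys.append(y)
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical pan: three 320x240 images shifted 0, 200 and 400 pixels to the right.
thetas = [
    ((1, 0, 0), (0, 1, 0)),
    ((1, 0, 200), (0, 1, 0)),
    ((1, 0, 400), (0, 1, 0)),
]
min_x, min_y, max_x, max_y = required_frame(thetas, 320, 240)
print(max_x - min_x, max_y - min_y)  # 720 240
```

The result depends only on the set of warps and not on their order, which fits the point above that the sequence is treated as an unordered set: what matters is the total extent of the camera movement, not how quickly it happens.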