msight-tech/research-ms-loss

Loss is not stable and is sometimes NaN

Closed this issue · 3 comments

I use resnet50 + MS loss to train on my own dataset, but the loss sometimes becomes NaN. It seems the loss is not very stable.

Maybe the learning rate you are using is too large, so the exp part of the loss overflows and the loss becomes NaN.
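To illustrate the point about the exp part, here is a minimal sketch (not the repository's code) of how a term of the form log(1 + sum(exp(x))) can be evaluated without overflow by factoring out the largest exponent. The names `log1p_sum_exp` and `ms_neg_term`, and the values `beta=50.0` and `lam=0.5`, are hypothetical and chosen only for illustration. In PyTorch the same idea could probably be expressed with `torch.logsumexp`; lowering the learning rate is still worth trying first.

```python
import math

def log1p_sum_exp(xs):
    """Compute log(1 + sum(exp(x) for x in xs)) without overflow."""
    # Treat the constant 1 as exp(0), then factor out the largest exponent m,
    # so every exp() call receives an argument <= 0.
    m = max(0.0, max(xs, default=0.0))
    total = math.exp(-m) + sum(math.exp(x - m) for x in xs)
    return m + math.log(total)

def ms_neg_term(sims, beta=50.0, lam=0.5):
    """Hypothetical negative-pair term: (1/beta) * log(1 + sum(exp(beta * (s - lam))))."""
    # beta and lam are illustrative assumptions, not values taken from the repo.
    return log1p_sum_exp([beta * (s - lam) for s in sims]) / beta

# A naive math.exp(1000.0) raises OverflowError; the stable form does not.
print(log1p_sum_exp([1000.0]))  # → 1000.0
```

If the similarities grow very large (for example because the embeddings are not normalized or the weights have diverged), the naive form overflows while this form stays finite, which makes it easier to tell a numerical problem from a genuinely diverging model.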

I managed to get resnet50 working. Check this out:
chammika-become@d0c49eb

You need to download the official weights to ~/.torch/models/resnet50-19c8e357.pth.
Running with configs/example_resnet50.yaml (note that the std/mean is different) gave me R@1 65.1 on CUB.

@autocyz, the comments in this thread should answer your issue.