hyz-xmaster/VarifocalNet

Using the MMDet version of VFNet with the latest backbones (e.g. Poolformer S36, ConvNeXt Small) causes Inf values in the varifocal loss

cydiachen opened this issue · 1 comment

Thank you for your excellent work.
I am now experimenting with improving VFNet by using the latest backbones (e.g. Poolformer S36, ConvNeXt Small).
The network trains fine for the first 5 epochs and then suffers a significant performance drop caused by an unexpected Inf value in cls_loss (in my case, the varifocal loss).
I would appreciate any advice on tracking down the issue.
(I have tried grad_clip to clip the Inf gradients, but it does not solve the issue.)

Hi, if the first 5 epochs are warm-up epochs, you may set a lower learning rate. The 'Inf' value problem is possibly caused by some very large negative predictions, say -100000000, which lead to log(sigmoid(p)) -> -Inf and hence an Inf loss.
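
To make the failure mode concrete, here is a minimal sketch in plain Python (no MMDet or PyTorch involved) of how a very large negative logit turns log(sigmoid(p)) into -Inf when it is computed literally, and how an algebraically equivalent form stays finite. The function names `naive_log_sigmoid` and `stable_log_sigmoid` are hypothetical and not part of VFNet or MMDet. The overflow handling in the naive version is an assumption that imitates what floating-point tensor arithmetic would produce. Whether this is the actual cause in your run depends on how the loss is implemented in your MMDet version.

```python
import math


def naive_log_sigmoid(p: float) -> float:
    """log(sigmoid(p)) computed literally as log(1 / (1 + exp(-p)))."""
    try:
        e = math.exp(-p)
    except OverflowError:
        # Assumption: imitate tensor arithmetic, where exp() overflows to inf
        # instead of raising an exception.
        e = math.inf
    s = 1.0 / (1.0 + e)  # becomes exactly 0.0 for very negative p
    # math.log(0.0) raises in Python; tensor libraries return -inf instead.
    return math.log(s) if s > 0.0 else -math.inf


def stable_log_sigmoid(p: float) -> float:
    """Equivalent form: min(p, 0) - log1p(exp(-|p|)); exp() never overflows."""
    return min(p, 0.0) - math.log1p(math.exp(-abs(p)))


if __name__ == "__main__":
    for p in (0.0, -50.0, -100000000.0):
        print(p, naive_log_sigmoid(p), stable_log_sigmoid(p))
```

A practical way to track the issue down along these lines is to log the minimum and maximum of the classification logits just before the loss is computed, and to check with something like `math.isfinite` (or its tensor equivalent) whether the logits themselves are already Inf or NaN before they reach the loss.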