NaN loss for ResNeXt backbone trained on CIFAR-100

Question

devavratTomar opened this issue a year ago · 1 comment

Thank you for your work. When I run your code with the ResNeXt backbone on CIFAR-100, the training loss becomes NaN. As described in the published paper, I use SGD with an initial learning rate of 0.1 and cosine scheduling.
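
To make the setup above concrete, here is a minimal sketch in plain Python of a cosine learning-rate schedule starting at 0.1, plus a helper that finds the first non-finite loss. It assumes the standard cosine annealing form with no warmup and a minimum learning rate of 0; the repository's actual scheduler may differ. The names `cosine_lr` and `first_non_finite` and the step count in the usage note are illustrative, not taken from the repository.

```python
import math


def cosine_lr(step, total_steps, base_lr=0.1, min_lr=0.0):
    """Cosine-annealed learning rate at `step`, for 0 <= step <= total_steps.

    Assumption: plain cosine annealing without warmup or restarts.
    """
    progress = step / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def first_non_finite(losses):
    """Return the index of the first NaN or inf loss, or None if all are finite."""
    for i, loss in enumerate(losses):
        if not math.isfinite(loss):
            return i
    return None
```

For example, with a hypothetical 200-step run, `cosine_lr(0, 200)` gives the initial rate of 0.1 and the rate decays toward 0 at step 200. Logging it next to the index returned by `first_non_finite` shows whether the loss diverges during the early high-learning-rate steps or later in training.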

Answer 1 · 2024-02-09T21:28:19.000Z

Yes, same here.
Could you please help with this?

Thanks,
Sara