Validation accuracy stays at 0.09% during training
edizhuang opened this issue · 3 comments
Dear authors,
I'm interested in your paper and am training the model from scratch on ImageNet. However, the validation accuracy stays at * Acc@1 0.090 throughout training.
Do you have any idea why this happens? Training Swin Transformer works.
I have tried PyTorch 1.7.1 and 1.6.0, with no mixed precision, for 100 epochs.
--amp-opt-level O0 --output ./output --opts TRAIN.EPOCHS 100
Thanks,
Eddie
Thanks for your feedback.
It is recommended to use mixed precision and train the model for 300 epochs, i.e., --amp-opt-level native --opts TRAIN.EPOCHS 300
(These look like the only two differences between your training command and ours.)
If you can share the exact command you ran and the training log, I may be able to give more specific advice :)
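As a side note, an Acc@1 of 0.090 is roughly what random guessing gives on 1000 classes (0.1%), which suggests the model is not learning at all rather than learning slowly. Below is a minimal sketch for checking a training log for this pattern. It assumes the 1000 classes of ImageNet-1k and summary lines in the `* Acc@1 0.090` format quoted above. The names `parse_acc1` and `stuck_at_chance` and the tolerance value are hypothetical and not part of this repository.

```python
import re

# Matches summary lines such as "* Acc@1 0.090" (format taken from this thread).
ACC1_PATTERN = re.compile(r"\*\s*Acc@1\s+([0-9]+(?:\.[0-9]+)?)")


def parse_acc1(log_text):
    """Return every Acc@1 value (in percent) found in the log text."""
    return [float(m.group(1)) for m in ACC1_PATTERN.finditer(log_text)]


def stuck_at_chance(acc1_values, num_classes=1000, tolerance=0.05):
    """Return True if every logged Acc@1 is within `tolerance` percentage
    points of random guessing (assumed: uniform guess over `num_classes`)."""
    chance = 100.0 / num_classes  # 0.1% for 1000 classes
    return bool(acc1_values) and all(
        abs(acc - chance) <= tolerance for acc in acc1_values
    )


if __name__ == "__main__":
    # Hypothetical log excerpt using the line format reported in this issue.
    sample_log = " * Acc@1 0.090\n * Acc@1 0.090\n"
    values = parse_acc1(sample_log)
    print(values, stuck_at_chance(values))
```

If the check returns True for every epoch, the training log is worth inspecting for NaN or constant loss values before tuning anything else.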
Hi cheerss,
It does work with --amp-opt-level native. The first epoch is * Acc@1 1.488.
It is so weird that only mixed precision works. Just letting you know.
Thanks,
Eddie
It seems there was a bug when training in O0 mode. We have fixed it, and the program works well now.
Thanks for your feedback.