Longer training schedule?
JUGGHM opened this issue · 2 comments
Hi Assran, thank you for your great work! I wonder: if we use a longer pre-training schedule (800/1200/1600 epochs), how much of a performance advantage can we get over previous methods like MAE?
This is especially of interest for full fine-tuning settings.
Hi @JUGGHM, with a small image resolution (e.g., 224x224) we didn't see a large enough advantage from longer training on IN1k to justify the additional compute, but I think longer training at a higher resolution (e.g., 448x448) or on bigger datasets should lead to non-trivial performance improvements.
I'll close this issue for now, but let me know if you get the chance to try this out!