Hyperparams for ImageNet
Closed this issue · 7 comments
michaelklachko commented
Hi, have you found a good set of hyperparameters for ImageNet?
narumiruna commented
Hi, I tried configs/imagenet.yaml to train B0 and got 0.7465.
michaelklachko commented
Wow, that's almost 2% worse than the B0 result in the paper! Have you tried other models (B3 or B4)?
narumiruna commented
No, I don't have enough resources to do that.
michaelklachko commented
How did you choose the params in that yaml file?
narumiruna commented
optimizer.lr = base_lr * batch_size / 256 = 0.1 * 32 / 256 = 0.0125
scheduler.gamma = 0.97 ^ (1 / 2.4) = 0.98738885893
The model seemed to be under-fitting during training, so I set weight_decay to 0.
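For anyone who wants to check the arithmetic above, here is a minimal Python sketch. The helper names `scaled_lr` and `per_epoch_gamma` are made up for illustration and are not part of the repo. The reading of the second formula as "a 0.97 decay every 2.4 epochs, converted to a per-epoch factor" is an assumption based on the numbers.

```python
def scaled_lr(base_lr: float, batch_size: int, reference_batch: int = 256) -> float:
    """Linear scaling rule: scale the base learning rate by batch_size / reference_batch."""
    return base_lr * batch_size / reference_batch


def per_epoch_gamma(decay: float, every_n_epochs: float) -> float:
    """Per-epoch multiplier that compounds to `decay` after `every_n_epochs` epochs."""
    return decay ** (1.0 / every_n_epochs)


# Values taken from the comment above.
lr = scaled_lr(base_lr=0.1, batch_size=32)               # 0.1 * 32 / 256
gamma = per_epoch_gamma(decay=0.97, every_n_epochs=2.4)  # 0.97 ^ (1 / 2.4)

print(f"optimizer.lr    = {lr}")
print(f"scheduler.gamma = {gamma:.11f}")
```

If the scheduler multiplies the learning rate by `gamma` once per epoch, then after 2.4 epochs the cumulative factor is `gamma ** 2.4`, which comes back to 0.97.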
michaelklachko commented
Why is your batch size so small? Is that per GPU? How many GPUs did you use?
narumiruna commented
I used 1 GPU to train the model, so it's per GPU.