narumiruna/efficientnet-pytorch

Hyperparams for ImageNet

Hi, have you found a good set of hyperparameters for ImageNet?

Hi, I trained B0 with configs/imagenet.yaml and got an accuracy of 0.7465.

Wow, that's almost 2% worse than the B0 result in the paper! Have you tried other models (B3 or B4)?

No, I don't have enough resources to do that.

How did you choose the params in that yaml file?

optimizer.lr = base_lr * batch_size / 256 = 0.1 * 32 / 256 = 0.0125
scheduler.gamma = 0.97 ^ (1 / 2.4) = 0.98738885893

I felt the model was under-fitting during training, so I set weight_decay to 0.
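
For anyone who wants to recompute these values for a different batch size, here is a minimal sketch of the two formulas above. The function names `scaled_lr` and `per_epoch_gamma` are made up for illustration and are not part of this repo. The gamma formula is read here as converting a decay of 0.97 every 2.4 epochs into a per-epoch factor, which assumes the scheduler steps once per epoch.

```python
def scaled_lr(base_lr: float, batch_size: int, ref_batch_size: int = 256) -> float:
    """Linear scaling rule: lr grows in proportion to the batch size."""
    return base_lr * batch_size / ref_batch_size


def per_epoch_gamma(decay: float = 0.97, epochs_per_decay: float = 2.4) -> float:
    """Per-epoch factor that compounds to `decay` after `epochs_per_decay` epochs.

    Assumption: the scheduler multiplies the lr by gamma once per epoch.
    """
    return decay ** (1.0 / epochs_per_decay)


# Values from this thread: base_lr = 0.1, batch_size = 32 on a single GPU.
lr = scaled_lr(base_lr=0.1, batch_size=32)
gamma = per_epoch_gamma()

print(f"optimizer.lr    = {lr}")
print(f"scheduler.gamma = {gamma}")
```

With a larger batch size, only the `batch_size` argument changes; gamma stays the same as long as the scheduler still steps once per epoch.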

Why is your batch size so small? Is that per GPU? How many GPUs did you use?

I used 1 GPU to train the model, so it's per GPU.