Hyperparams for ImageNet

Question


Hi, have you found a good set of hyperparameters for ImageNet?

Answer 1 · 2019-12-04T08:11:10.000Z

Hi, I trained B0 with configs/imagenet.yaml and got 0.7465.

Answer 2 · 2019-12-04T08:24:22.000Z

Wow, that's almost 2% worse than the B0 result in the paper! Have you tried other models (B3 or B4)?

Answer 3 · 2019-12-04T08:26:38.000Z

No, I don't have enough resources to do that.

Answer 4 · 2019-12-04T08:28:40.000Z

How did you choose the params in that yaml file?

Answer 5 · 2019-12-04T08:59:34.000Z

optimizer.lr = base_lr * batch_size / 256 = 0.1 * 32 / 256 = 0.0125
scheduler.gamma = 0.97 ^ (1 / 2.4) = 0.98738885893

The model seemed to be under-fitting during training, so I set weight_decay to 0.
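
For anyone who wants to redo this arithmetic for a different batch size, here is a minimal sketch. It is not the repository's code: the function names are hypothetical, and reading 0.97 as a decay factor applied once every 2.4 epochs (converted to a per-epoch gamma) is an assumption based only on the formula given in this answer.

```python
def scaled_lr(base_lr: float, batch_size: int, reference_batch: int = 256) -> float:
    """Linear scaling rule: the learning rate grows in proportion to batch size."""
    return base_lr * batch_size / reference_batch


def per_epoch_gamma(decay: float, decay_epochs: float) -> float:
    """Per-epoch factor that compounds to `decay` after `decay_epochs` epochs (assumed reading)."""
    return decay ** (1.0 / decay_epochs)


# Values quoted in this answer: base_lr = 0.1, batch_size = 32, decay 0.97 over 2.4 epochs.
lr = scaled_lr(0.1, 32)
gamma = per_epoch_gamma(0.97, 2.4)

print(f"optimizer.lr    = {lr:.4f}")
print(f"scheduler.gamma = {gamma:.8f}")
```

With a larger total batch size, only the `batch_size` argument changes; under this reading, gamma does not depend on the batch size.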

Answer 6 · 2019-12-04T09:03:42.000Z

Why is your batch size so small? Is that per GPU? How many GPUs did you use?

Answer 7 · 2019-12-04T09:13:41.000Z

I used 1 GPU to train the model, so it's per GPU.