Learning Rate and Batch Size
FabianSchuetze opened this issue · 3 comments
Hi,
thanks for the fantastic work. I am currently trying to train the tiny model from the Imagenet-pretrained weights on the ADE dataset to begin integrating your work into mmsegmentation, as discussed here and here.
However, I am confused about the batch size and learning rate. In the paper, you mention a batch size of 16 and the use of 8 GPUs. However, the config sets samples_per_gpu
to 8. Can you kindly tell me the total batch size used for training and the corresponding learning rate?
Best wishes & many thanks,
Fabian
Hi,
The total batch size is 16 for the ADE20K dataset, and we use 2 GPUs to train SegNeXt on this benchmark.
As @MenghaoGuo said, we train SegNeXt-tiny with 2 GPUs and SegNeXt-large with 4 GPUs, because we don't have that many GPUs. Their learning rate and total batch size are the same, though.
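To make the arithmetic explicit: in mmsegmentation-style configs, the effective (total) batch size is samples_per_gpu multiplied by the number of GPUs used by the launcher. A minimal sketch, assuming the values discussed above (samples_per_gpu=8 from the config, 2 GPUs for the tiny model, 4 for the large — note 4 × 8 would give 32, so for the large model samples_per_gpu would need to be 4 to keep the total at 16):

```python
# Effective batch size in mmsegmentation = num_gpus * samples_per_gpu.
# Values below are assumptions based on this thread, not an official config.

def total_batch_size(num_gpus: int, samples_per_gpu: int) -> int:
    """Total batch size seen by the optimizer across all GPUs."""
    return num_gpus * samples_per_gpu

# SegNeXt-tiny on ADE20K: 2 GPUs, samples_per_gpu=8 -> total 16
print(total_batch_size(num_gpus=2, samples_per_gpu=8))  # 16

# SegNeXt-large: 4 GPUs; samples_per_gpu must drop to 4 to keep total 16
print(total_batch_size(num_gpus=4, samples_per_gpu=4))  # 16
```

Since the total batch size is the same in both setups, the learning rate from the config can be reused unchanged; only when the total batch size changes would linear LR scaling typically be applied.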
Thanks for the comments; they were very helpful to me. I will try another training run and report the results.