hustvl/MIMDet

How to change the config to train Benchmarking-ViT-B with batch size 16?

Yingdong-Hu opened this issue · 2 comments

Hi, thanks for the great project!
How is max_iter=184375 in the Benchmarking-ViT-B config calculated? Is it (num_images * epochs) / batch_size?
I want to train a Benchmarking-ViT-B model with batch size 16 in an 8-GPU environment, but I am confused by the config file.
Could you advise how to adjust hyper-parameters such as max_iter and eval_period if I change the batch size to 16?

train = dict(
    output_dir="output/benchmarking_mask_rcnn_base_FPN_100ep_LSJ_mae",
    init_checkpoint="",
    max_iter=184375,
    amp=dict(enabled=True),  # options for Automatic Mixed Precision
    ddp=dict(  # options for DistributedDataParallel
        broadcast_buffers=False,
        find_unused_parameters=False,
        fp16_compression=True,
    ),
    checkpointer=dict(period=1844, max_to_keep=100),  # options for PeriodicCheckpointer
    eval_period=1844,
    log_period=20,
    device="cuda",
    # ...
)

Hi @Alxead, thanks for your interest in our work.
You are right: max_iter = (num_images * epochs) / batch_size, where num_images ≈ 118,000 for COCO train2017. With the default total batch size of 64 and the 100-epoch schedule, that gives 118000 * 100 / 64 = 184375.
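For concreteness, a minimal sketch of that arithmetic (the batch size of 64 is the default total batch size implied by the released config):

# Sketch: how max_iter in the released config is derived.
num_images = 118000   # approximate size of COCO train2017
epochs = 100          # schedule length of the 100ep config
batch_size = 64       # default total batch size assumed here

max_iter = num_images * epochs // batch_size
print(max_iter)  # 184375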

I suggest you set max_iter = 184375 * 4 and eval_period = 1844 * 4 if your batch size is 16, since 16 is 1/4 of the default batch size of 64.
Also, I recommend scaling the learning rate down accordingly, to lr = 4e-5 (linear scaling with batch size); see the sketch below.
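If it helps, here is a minimal sketch of the rescaling, not the repo's own code: the helper rescale is hypothetical, and the base lr of 1.6e-4 is an assumption inferred from the suggested 4e-5 and the factor of 4.

# Minimal sketch (hypothetical helper): recompute schedule values for a
# smaller total batch size, assuming iterations and lr scale linearly.
def rescale(base_batch_size, new_batch_size, base_max_iter, base_eval_period, base_lr):
    scale = base_batch_size / new_batch_size
    return dict(
        max_iter=int(base_max_iter * scale),
        eval_period=int(base_eval_period * scale),
        lr=base_lr / scale,  # base lr of 1.6e-4 below is an assumption
    )

print(rescale(64, 16, 184375, 1844, 1.6e-4))
# {'max_iter': 737500, 'eval_period': 7376, 'lr': 4e-05}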

But note that this config is not guaranteed to reproduce the original accuracy.

I believe the issue at hand has been addressed, so I'm closing it. Feel free to ask if you have further questions.