
How does the batch_size parameter influence accuracy?


Hi, I tried to train the model on the Diving48 dataset on my machine, but with batch_size set to 16 the training runs out of GPU memory. So I trained with batch_size 14 and got 27.02 class accuracy, 34.49 Prec@1, and 62.69 Prec@5, which are lower than the results in your paper. Is there something wrong with my training settings, or is the batch_size parameter really that important?

Training setting:
python main.py diving48 RGB --arch InceptionV3 --num_segments 16 --consensus_type avg --batch-size 14 --iter_size 2 --dropout 0.5 --lr 0.01 --warmup 10 --epochs 60 --eval-freq 5 --gd 20 --run_iter 1 -j 16 --npb --gsm

Testing setting:
python test_models.py diving48 RGB model/diving48_InceptionV3_avg_segment16_batch14_epochs60_best.pth.tar --arch InceptionV3 --crop_fusion_type avg --test_segments 16  --test_crops 1 --num_clips 2 --gsm --save_scores

For Diving48, we used a batch size of 8 and trained for 20 epochs with a dropout of 0.7 (see Sec. 4.2 in the paper). If you are still getting lower accuracy, train the model for 30 epochs.

python main.py diving48 RGB --arch InceptionV3 --num_segments 16 --consensus_type avg --batch-size 8 --iter_size 2 --dropout 0.7 --lr 0.01 --warmup 10 --epochs 20 --eval-freq 5 --gd 20 --run_iter 1 -j 16 --npb --gsm

So, is this OK?

Use the following setting:

python main.py diving48 RGB --arch InceptionV3 --num_segments 16 --consensus_type avg --batch-size 8 --iter_size 1 --dropout 0.7 --lr 0.01 --warmup 10 --epochs 20 --eval-freq 5 --gd 20 --run_iter 1 -j 16 --npb --gsm
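Note (an assumption, not confirmed in this thread): in TSN-style training code, iter_size typically implements gradient accumulation, so the effective batch size is batch-size × iter_size. With --batch-size 8 --iter_size 1 the effective batch is 8, matching the paper's setting, whereas --iter_size 2 would effectively double it to 16. A minimal PyTorch sketch of the accumulation pattern follows; model, loader, criterion, and optimizer are toy stand-ins, not the repo's actual objects:

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins; the real code uses the GSM model and a video data loader.
model = nn.Linear(10, 4)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loader = DataLoader(TensorDataset(torch.randn(32, 10),
                                  torch.randint(0, 4, (32,))),
                    batch_size=8)

iter_size = 2  # accumulate gradients over this many mini-batches
optimizer.zero_grad()
for i, (inputs, targets) in enumerate(loader):
    # Scale the loss so the accumulated gradient matches one large batch.
    loss = criterion(model(inputs), targets) / iter_size
    loss.backward()  # gradients accumulate in param.grad across iterations
    if (i + 1) % iter_size == 0:
        optimizer.step()        # one update per iter_size mini-batches (effective batch = 8 * 2 = 16)
        optimizer.zero_grad()   # reset the accumulated gradients

If iter_size works this way here, that would also explain part of the gap above: batch 14 with iter_size 2 gives an effective batch of 28, quite different from the paper's 8, on top of the different dropout and epoch count.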