gabeur/mmt

How to speed up the training process?

sqiangcao99 opened this issue · 2 comments

Hi. Thank you for generously sharing your work. When I trained the model on MSRVTT with a single V100, I found that GPU utilization could not reach 100% (it stayed around 60%). Do you have any tips? Thank you.

To speed up training, you could increase the batch size until the GPU memory is full.
However, I found that a larger batch size caused more overfitting, so I only recommend it when training on a large dataset (like HowTo100M).
It is probably possible to mitigate the overfitting with stronger regularisation or a different learning rate decay, but I did not experiment with that, sorry.
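For illustration, here is a minimal plain-Python sketch of two learning rate decay shapes one might try when tuning for a larger batch size. The function names and the `drop`/`gamma`/`every` values are hypothetical examples, not the settings used in this repo:

```python
def step_decay(base_lr, epoch, drop=0.1, every=10):
    """Multiply the LR by `drop` once every `every` epochs (piecewise-constant)."""
    return base_lr * (drop ** (epoch // every))

def exp_decay(base_lr, epoch, gamma=0.95):
    """Smooth exponential decay: lr = base_lr * gamma**epoch."""
    return base_lr * (gamma ** epoch)

if __name__ == "__main__":
    base_lr = 5e-5  # assumed starting LR, for illustration only
    for epoch in (0, 10, 20):
        print(epoch, step_decay(base_lr, epoch), exp_decay(base_lr, epoch))
```

A gentler schedule (e.g. the exponential one) keeps the LR higher for longer, which can act as implicit regularisation; whether that actually helps here would need to be verified experimentally.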

Thank you for your quick response.