microsoft/VideoX

[X-CLIP]About training time.

yusq45 opened this issue · 1 comment

Thanks for your great work!
The problem: although my data is on an SSD, training one epoch of ViT-B/32 still takes me 2 hours.
I saw that an epoch took you only 7 minutes. I should point out that my GPU utilization is 0 most of the time.

nbl97 commented

Thanks for your interest. First, the ViT-B/32 model was trained on 32 V100 GPUs. Second, please check the data loading time, since slow loading can reduce GPU utilization. In addition, we resized the videos in advance so that the short side is 256 px, which saves storage and speeds up reading, though I'm not sure how much of a speedup this gives. Finally, if you use the tar format, please make sure you are only packing the data, not compressing it. Hope this helps.
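
To make two of the checks above concrete, here is a minimal sketch using only the Python standard library. It is not code from the X-CLIP repository: `data_wait_fraction`, `is_uncompressed_tar`, and the `train_step` callback are hypothetical names chosen for illustration. The first helper estimates how much of the wall-clock time is spent waiting for the next batch; the second reports whether a tar archive was written without compression.

```python
import tarfile
import time


def data_wait_fraction(loader, train_step, max_batches=50):
    """Return the fraction of wall-clock time spent waiting for batches.

    `loader` is any iterable of batches and `train_step` is a callable that
    consumes one batch (both are placeholders for your own objects).
    Note: if `train_step` launches asynchronous GPU work, it should
    synchronize before returning, otherwise compute time leaks into the
    measured wait time of the next batch.
    """
    wait = 0.0
    start = time.perf_counter()
    batches = iter(loader)
    for _ in range(max_batches):
        t0 = time.perf_counter()
        try:
            batch = next(batches)  # time blocked on data loading
        except StopIteration:
            break
        wait += time.perf_counter() - t0
        train_step(batch)
    total = time.perf_counter() - start
    return wait / total if total > 0 else 0.0


def is_uncompressed_tar(path):
    """Return True if `path` is a plain tar archive (packed, not compressed)."""
    try:
        # Mode "r:" accepts only uncompressed archives and raises
        # tarfile.ReadError for gzip/bz2/xz-compressed ones.
        with tarfile.open(path, mode="r:"):
            return True
    except tarfile.ReadError:
        return False
```

A wait fraction close to 1 would suggest that training is bound by data loading rather than by the GPU. When creating archives with the `tarfile` module, `tarfile.open(path, "w")` packs without compression, whereas `"w:gz"` compresses.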