Does this code need large GPU memory?
bqdeng opened this issue · 3 comments
Hello! When training on the CIFAR-10 dataset with the batch size set to 2048, have you found that the code cannot run on dual NVIDIA 3090 cards? I get a GPU out-of-memory error.
So I changed the batch size to 256, which still runs out of memory.
Finally, I had no choice but to change it to 128 to get it to run.
However, with the SimCLR and SwAV code, the batch size I can set on the same device is not nearly so small: I can generally run 2048 or 1024. Is this normal?
My device is a dual-card NVIDIA 3090 setup with 48 GB of video memory in total. The training dataset is CIFAR-10.
If you could answer, I would be very grateful!
Hi,
I don't know about CIFAR-10, but on ImageNet, with images of size 3 * 224 * 224, a batch of size 2048 does not fit on a single GPU. It is distributed across 32 GPUs, so the batch size for a single GPU is 64.
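As a quick illustration of the arithmetic above, here is a minimal sketch of how a global batch splits across GPUs. The helper name `per_gpu_batch_size` is hypothetical and not part of the repository; it assumes the global batch is divided evenly across devices.

```python
def per_gpu_batch_size(global_batch_size: int, num_gpus: int) -> int:
    """Hypothetical helper: split a global batch evenly across GPUs."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    if global_batch_size % num_gpus != 0:
        raise ValueError("global batch size must be divisible by num_gpus")
    return global_batch_size // num_gpus


# ImageNet setup described above: 2048 images over 32 GPUs.
imagenet_per_gpu = per_gpu_batch_size(2048, 32)

# The same global batch on a dual-GPU machine puts far more on each card.
dual_gpu_per_gpu = per_gpu_batch_size(2048, 2)

print(imagenet_per_gpu, dual_gpu_per_gpu)
```

Under this assumption, a global batch of 2048 on two cards means 1024 samples per GPU, compared with 64 per GPU in the 32-GPU ImageNet setup.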
Given that CIFAR images are much smaller, you should be able to fit much bigger batches on your GPU, especially if you have 48 GB of memory.
I think that in that case the bottleneck might be the projector, which has a lot of parameters. Can you try running the code with --mlp 4096-128 instead of --mlp 8192-8192-8192?
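To get a rough sense of the difference between the two projector settings, here is a minimal sketch that counts weights for a plain stack of fully connected layers. It is an approximation under stated assumptions, not the repository's implementation: the function `projector_param_count` is hypothetical, every layer is assumed to be a linear layer with a bias, normalization layers are ignored, and the backbone output dimension is assumed to be 2048 (a smaller backbone would give smaller numbers).

```python
def projector_param_count(embedding_dim: int, mlp_spec: str) -> int:
    """Hypothetical helper: count weights and biases of an MLP given as "d1-d2-...".

    Assumes each layer is a plain linear layer with a bias term and
    ignores any normalization layers.
    """
    dims = [embedding_dim] + [int(d) for d in mlp_spec.split("-")]
    total = 0
    for fan_in, fan_out in zip(dims[:-1], dims[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total


EMBEDDING_DIM = 2048  # assumption: backbone output size, not taken from the thread

large = projector_param_count(EMBEDDING_DIM, "8192-8192-8192")
small = projector_param_count(EMBEDDING_DIM, "4096-128")

print(f"8192-8192-8192: {large:,} parameters")
print(f"4096-128:       {small:,} parameters")
print(f"ratio:          {large / small:.1f}x")
```

Under these assumptions the larger projector has about 151 million parameters against about 8.9 million for the smaller one. This counts weights only; gradients, optimizer state, and the projector's activations (which grow with batch size times layer width) add to the memory actually used during training.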
Thank you for your answer.
Your intuition is right. I think I need to think it over again. Thank you for your reply.
If you can, please do not close this issue for the time being. In a few days, I'll share the results of running the code on a small dataset, as a conclusion to this question.
Thank you again for your great work and enthusiastic reply!
I am closing the issue. If you want to chat more, you can contact me at abardes@fb.com