Batch size changes for memory crashing
mcianfrocco opened this issue · 2 comments
Hello -
I can't find guidance on which batch sizes to try when a job runs out of GPU memory. I have a 512x512x512 volume that crashes K80 GPUs, and I'd like to change the batch size, but I'm not sure which values to try.
Thanks!
Mike
Dear Mike,
By default we use a batch size of 6, so if you are getting an out-of-memory error, I would first try a batch size of 2 and, if the problem remains, 1.
If you still get the out-of-memory error, the problem is probably caused not by the batch size but by a zombie process that remains on the GPU and is holding the memory.
Use nvidia-smi to monitor the memory, and if it is not empty before execution, please kill the zombie process.
I hope this helps you.
Ruben
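For anyone who wants to script the nvidia-smi check described above, here is a minimal sketch in Python. It assumes nvidia-smi is on the PATH and uses its CSV query output. The helper names (`parse_gpu_memory`, `busy_gpus`, `query_gpu_memory`) and the 100 MiB threshold are illustrative assumptions, not part of this project.

```python
import subprocess

# nvidia-smi query: one CSV row per GPU, memory values in MiB (no units printed).
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse rows such as '0, 11, 11441' into dicts with integer fields."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (field.strip() for field in line.split(","))
        rows.append(
            {"index": int(index), "used_mib": int(used), "total_mib": int(total)}
        )
    return rows


def busy_gpus(rows, threshold_mib=100):
    """Return indices of GPUs already using more than threshold_mib.

    The 100 MiB default is an arbitrary assumption; tune it for your setup.
    """
    return [row["index"] for row in rows if row["used_mib"] > threshold_mib]


def query_gpu_memory():
    """Run nvidia-smi and return the parsed per-GPU memory usage."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)
```

Calling `busy_gpus(query_gpu_memory())` before launching a job lists the GPUs that already hold memory. To find the process IDs to kill, `nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv` shows which processes are holding it.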
Hi Ruben,
Thank you for the information, it helps a lot. I'd recommend adding it to the README so that others know how to adjust the batch size. For what it's worth, switching to a batch size of 5 allowed our job to finish on K80 GPUs.
Mike
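Since the largest batch size that fits depends on the volume and the GPU, one option is to step down from the default of 6 until a run succeeds. The sketch below is only an illustration under stated assumptions: `run_job` is a hypothetical callable that launches the job with a given batch size and raises `MemoryError` when the GPU runs out of memory. It is not this project's API, and a real framework will likely raise its own exception type that you would need to translate.

```python
def first_batch_size_that_fits(run_job, candidates=(6, 5, 4, 3, 2, 1)):
    """Try batch sizes from largest to smallest; return the first that works.

    run_job is a hypothetical callable: run_job(batch_size) should raise
    MemoryError on an out-of-memory failure (an assumption of this sketch).
    Returns None if every candidate fails.
    """
    for batch_size in candidates:
        try:
            run_job(batch_size)
        except MemoryError:
            # This batch size did not fit; try the next smaller one.
            continue
        return batch_size
    return None
```

If even a batch size of 1 fails, the function returns `None`, which is the point at which checking for leftover processes with nvidia-smi is worthwhile.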