rsanchezgarc/deepEMhancer

Batch size changes for memory crashing

mcianfrocco opened this issue · 2 comments

Hello -

I can't find guidance on which batch sizes to try when my GPUs run out of memory. I have a 512x512x512 volume that crashes K80 GPUs, and I'd like to change the batch size, but I'm unsure which values to try.

Thanks!
Mike

Dear Mike,

By default we use a batch size of 6, so if you are getting an out-of-memory error, I would first try a batch size of 2 and, if problems remain, finally 1.
If you still get the out-of-memory error, the problem is probably caused not by the batch size but by a zombie process that remains on the GPU and is holding the memory.
Use nvidia-smi to check the GPU memory, and if you find it is not empty before execution, please kill the zombie process.
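
The memory check described above can also be scripted. The following is only a minimal sketch, not part of deepEMhancer: the helper names (`parse_gpu_memory`, `query_gpu_memory`, `busy_gpus`) and the 100 MiB threshold are assumptions made for illustration. It relies on the `--query-gpu` and `--format=csv,noheader,nounits` options of nvidia-smi and assumes every GPU reports numeric memory values.

```python
import subprocess
from typing import Dict, List

# Fields requested from nvidia-smi; with "nounits" the memory values are plain MiB integers.
QUERY_FIELDS = "index,memory.used,memory.total"


def parse_gpu_memory(csv_text: str) -> List[Dict[str, int]]:
    """Parse the CSV text printed by nvidia-smi into one dict per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, used, total = (int(field.strip()) for field in line.split(","))
        rows.append({"index": index, "used_mib": used, "total_mib": total})
    return rows


def query_gpu_memory() -> List[Dict[str, int]]:
    """Run nvidia-smi and return the per-GPU memory usage (requires an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_gpu_memory(result.stdout)


def busy_gpus(rows: List[Dict[str, int]], threshold_mib: int = 100) -> List[int]:
    """Return the indices of GPUs using more memory than threshold_mib.

    The threshold is an arbitrary assumption: an idle GPU may still report a few MiB in use.
    """
    return [row["index"] for row in rows if row["used_mib"] > threshold_mib]
```

Calling `busy_gpus(query_gpu_memory())` before launching a job returns the GPUs that already hold memory. To see which processes hold it, `nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv` lists them with their PIDs.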

I hope this helps you.

Ruben

Hi Ruben,

Thank you for the information - it helps a lot. I'd recommend adding it to the README so that others know how to adjust the batch size. For what it's worth, switching to a batch size of 5 allowed our job to finish on K80 GPUs.

Mike
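
The advice in this thread amounts to stepping the batch size down from the default of 6 until a run fits in GPU memory. A rough sketch of that loop follows. Everything in it is hypothetical: `first_batch_size_that_fits` is an illustrative helper, and `run_job` stands for whatever callable the user supplies to launch a job with a given batch size and report whether it finished without an out-of-memory error. It is not a deepEMhancer API.

```python
from typing import Callable, Iterable, Optional


def first_batch_size_that_fits(
    run_job: Callable[[int], bool],
    candidates: Iterable[int] = (6, 5, 4, 3, 2, 1),
) -> Optional[int]:
    """Try batch sizes in the given order and return the first one that succeeds.

    run_job is a user-supplied (hypothetical) callable: it receives a batch size and
    returns True if the job completed, False if it ran out of GPU memory.
    Returns None when no candidate fits, which suggests the cause lies elsewhere,
    for example another process already holding GPU memory.
    """
    for batch_size in candidates:
        if run_job(batch_size):
            return batch_size
    return None
```

Trying the larger values first keeps as much throughput as the GPU allows. Passing `candidates=(6, 2, 1)` instead follows the shorter sequence suggested in the reply above.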