dragonlzm/ISAL

CIFAR10 test_calc_val_grad.py: grad_result = grad(loss, params) raises CUDA out of memory

Closed issue · 1 comment

I have an NVIDIA RTX 3060 GPU with 12 GB of memory. When the code reaches grad_result = grad(loss, params) in the classification script test_calc_val_grad.py, it fails with a CUDA out of memory error. The NVIDIA TITAN Xp used in your paper also has 12 GB. I have been debugging the code on Windows and Ubuntu 18.04 for a long time, but I can't solve it.

You can try enlarging the batch size. For each batch, the code needs to save a tensor that contains the gradient of every parameter of the model, so a smaller batch size means more of these tensors are kept on the GPU. When your validation set is large, this will exhaust your GPU memory.
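To get a feel for the scale involved, here is a rough back-of-the-envelope sketch in plain Python. It assumes one full float32 gradient copy per batch stays on the GPU, as described above. The validation set size (10,000 samples) and parameter count (11 million) are illustrative assumptions, not values taken from the ISAL code, and the helper names are hypothetical.

```python
import math


def num_batches(num_samples, batch_size):
    """Number of batches needed to cover the validation set once."""
    return math.ceil(num_samples / batch_size)


def stored_gradient_bytes(num_samples, batch_size, num_params, bytes_per_param=4):
    """Bytes needed if one full gradient copy is kept per batch.

    bytes_per_param=4 assumes float32 gradients (an assumption).
    """
    return num_batches(num_samples, batch_size) * num_params * bytes_per_param


# Hypothetical numbers, for illustration only.
NUM_VAL_SAMPLES = 10_000
NUM_PARAMS = 11_000_000

for batch_size in (16, 64, 256):
    total = stored_gradient_bytes(NUM_VAL_SAMPLES, batch_size, NUM_PARAMS)
    print(f"batch_size={batch_size:4d}  "
          f"batches={num_batches(NUM_VAL_SAMPLES, batch_size):4d}  "
          f"stored gradients ~ {total / 1e9:.2f} GB")
```

Under these assumptions, the stored gradients alone exceed 12 GB at a batch size of 16, while a batch size of 256 needs under 2 GB. Larger batches also use more activation memory per forward pass, which this estimate ignores.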
Enlarging the batch size will not change the final result, since we average the gradients over all batches.
Hope this helps.
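The claim that the batch size does not affect the averaged gradient can be checked on a toy problem. The sketch below is plain Python, not the repository's code: it uses a hypothetical one-parameter model with a squared-error loss, and it assumes each batch loss is a mean over its samples and that the dataset size is divisible by the batch size.

```python
import math


def per_sample_grad(w, x, y):
    """Gradient of (w * x - y) ** 2 with respect to w."""
    return 2.0 * (w * x - y) * x


def averaged_batch_gradient(w, xs, ys, batch_size):
    """Average the mean-reduced gradient of each batch over all batches."""
    batch_grads = []
    for start in range(0, len(xs), batch_size):
        bx = xs[start:start + batch_size]
        by = ys[start:start + batch_size]
        # Mean over the samples in this batch, as with a mean-reduced loss.
        grad = sum(per_sample_grad(w, x, y) for x, y in zip(bx, by)) / len(bx)
        batch_grads.append(grad)
    # Unweighted mean over batches.
    return sum(batch_grads) / len(batch_grads)


# Toy data, made up for illustration.
xs = [0.5, -1.0, 2.0, 1.5, -0.5, 3.0, 0.0, 1.0]
ys = [1.0, 0.0, 3.5, 2.0, -1.0, 5.0, 0.5, 1.5]
w = 0.7

small = averaged_batch_gradient(w, xs, ys, batch_size=2)
large = averaged_batch_gradient(w, xs, ys, batch_size=4)
full = averaged_batch_gradient(w, xs, ys, batch_size=8)

print(math.isclose(small, large), math.isclose(large, full))
```

If the last batch is smaller than the others, an unweighted mean over batches can differ slightly from the full-dataset gradient, so the equality above is exact only for equal-sized batches.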