philgras/neural-head-avatars

CUDA out of memory even with 8 GPUs

Opened this issue · 6 comments

Hi, I have tried to train the model with your optimize_nha.py,
but a CUDA out-of-memory error sometimes occurs at the beginning of training or after epoch 150.
I set gpus = 8 in the .ini file.
I also tried reducing the batch size, e.g. train_batch_size = [1, 1, 1], but it didn't help.
The GPUs are Tesla V100-SXM2 16 GB.
Could you let me know what the problem is?
I would have thought this hardware is enough to train the model.
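For reference, here are the settings mentioned above as they would appear in the .ini file (a sketch only; the key names are the ones quoted in this thread, and the full file layout follows the configs shipped with the repo):

```ini
; Settings as reported in this issue. Even with the per-stage batch
; sizes reduced to 1, the model itself may not fit into 16 GB per GPU.
gpus = 8
train_batch_size = [1, 1, 1]
```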

You could reduce the texture MLP's model size to make it fit, e.g. from 256 to 128. I am not sure how that affects the final result, though.
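A minimal sketch of that change, assuming a key name like texture_hidden_feats (the name is hypothetical; check the actual key in the repo's .ini configs):

```ini
; Hypothetical key name -- look up the real one in the shipped configs.
; Halving the hidden width shrinks both the weights and the
; per-layer activations of the texture MLP.
texture_hidden_feats = 128   ; was 256
```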

@jkhong99 did you solve it? I am facing the same issue even after changing the texture MLP model size.

I used 3 and 4 layers instead of 6 and 8 layers; that made training fit in memory. A sketch of the change is below.
I have not tried changing the texture MLP dimension yet.
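Roughly, in .ini terms (the key names are hypothetical, and the thread does not say which of the model's MLPs uses 6 layers and which uses 8; check the repo's config files for the real keys):

```ini
; Hypothetical key names for illustration only.
; Cutting the two MLP depths from 6/8 to 3/4 layers is what
; resolved the out-of-memory error in this thread.
offset_hidden_layers  = 3   ; was 6
texture_hidden_layers = 4   ; was 8
```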

It worked on my system as well after reducing the layers. Changing only the texture MLP dimension does not help, though.

Hi, what type of machine do I need to reproduce the results demonstrated in the paper? At least an A100/V100? What about a 1080 Ti or 2080 Ti?


No, that is not enough; you should use a 3090 Ti or better.