CUDA out of memory
Opened this issue · 6 comments
Impressive work!
While running the training code, I encountered a CUDA out of memory error.
Could you please advise on any settings that could reduce the memory requirements?
You can reduce the batch_size, default is 48.
I'm having the same issue: the GPU memory keeps increasing during training until it runs out.
Me too. During training, GPU memory usage keeps climbing no matter what batch size I use. Could some intermediate state be accumulating during training without being freed in time?
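One common cause of steadily growing GPU memory in PyTorch training loops is accumulating the loss tensor itself (which keeps its whole autograd graph alive) instead of a detached Python number. This is a hypothetical sketch of that pattern, not this repo's actual code — worth checking whether the training script does something similar:

```python
import torch

# Minimal training loop illustrating the graph-retention pitfall.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

running_loss = 0.0
for step in range(5):
    x = torch.randn(4, 10)
    loss = model(x).pow(2).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # BAD: `running_loss += loss` would store a graph-attached tensor,
    # so intermediate activations pile up across iterations.
    # GOOD: `.item()` converts to a plain float, letting the graph be freed.
    running_loss += loss.item()
```

The same applies to any per-step metrics stored in lists: append `tensor.item()` or `tensor.detach().cpu()`, not the live tensor.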
Hi, I just ran a quick test on an RTX 4090 (24GB) GPU and didn't encounter the problem you mentioned at any point during the training session. The original work was trained on one A100 GPU. Throughout the session, the monitored memory consumption stayed around 27%-37% of total GPU memory. Here is the hardware I tested on:
CUDA 11.3
GPU: RTX 4090 (24GB) * 1
CPU: 12 vCPU Intel(R) Xeon(R) Platinum 8352V CPU @ 2.10GHz
Memory: 90GB
@flww213 @leiershuai @ohhhh2022 I also see VRAM increasing during training... did you figure out the reason?
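For anyone still debugging this, one way to narrow down where the growth happens is to log allocated memory each step. A minimal helper, assuming a standard PyTorch setup (it's a no-op on CPU-only machines):

```python
import torch

def log_cuda_memory(step):
    # Print currently allocated vs. reserved CUDA memory in MiB.
    # If allocated memory grows monotonically across steps, some tensor
    # (often a graph-attached loss or metric) is being retained.
    if torch.cuda.is_available():
        alloc = torch.cuda.memory_allocated() / 2**20
        reserved = torch.cuda.memory_reserved() / 2**20
        print(f"step {step}: allocated {alloc:.1f} MiB, reserved {reserved:.1f} MiB")
```

Calling this at the end of every training iteration should make it obvious whether memory is genuinely leaking or the allocator is just caching (reserved can stay high while allocated is flat, which is normal).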