TencentARC/VQFR

Restoration Training. CUDA out of memory

MDYLL opened this issue · 2 comments

MDYLL commented

Hi!
The inference results are amazing!
I have experience with your GFPGAN and I'd like to train VQFR.
I have two GPUs with 12 GB memory. When I start training, I get the error "CUDA out of memory".
I tried changing "batch_size_per_gpu" to 1 and "--nproc_per_node" to 1, but I got the same error :(
Could you help me with the GPU requirements? Do I need an NVIDIA A100 (as in https://arxiv.org/pdf/2205.06803.pdf), or can I train on my GPUs?
Thank you

Sorry, the VQGAN training can run on a 24 GB GPU, and the restoration training needs a 40 GB GPU. Maybe you can try reducing "base_channels" from 128 to 48, or reducing "layers", to see whether it can run on your GPU. If you change the default configuration for restoration, you need to re-train the VQGAN with similar changes. Thanks for using our code.
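
For anyone trying this suggestion, here is a minimal sketch of applying the same "base_channels" override to a config that has already been loaded into a Python dict (for example from the training YAML). The helper name `override_key` and the nested layout shown (`network_g`, `datasets`) are assumptions for illustration only, not the repository's confirmed structure; check the actual option files for where "base_channels" and "layers" live.

```python
import copy


def override_key(cfg, key, value):
    """Return a deep copy of cfg with every occurrence of `key` set to `value`.

    Hypothetical helper: it walks nested dicts and lists, so it does not
    depend on where the key sits in the config.
    """
    cfg = copy.deepcopy(cfg)

    def _walk(node):
        if isinstance(node, dict):
            for k in node:
                if k == key:
                    node[k] = value  # replace the value in place
                else:
                    _walk(node[k])
        elif isinstance(node, list):
            for item in node:
                _walk(item)

    _walk(cfg)
    return cfg


# Assumed layout, for illustration only.
restoration_cfg = {
    "network_g": {"base_channels": 128},
    "datasets": {"train": {"batch_size_per_gpu": 1}},
}

small_cfg = override_key(restoration_cfg, "base_channels", 48)
print(small_cfg["network_g"]["base_channels"])
```

As noted in the reply above, the same override would have to be applied to the VQGAN config and that model re-trained, since the restoration stage builds on it.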

@guyuchao So the minimum requirement for a single GPU is 40 GB? May I ask why it takes so much memory? Thanks!