RuntimeError: CUDA out of memory.
KomputerMaster64 opened this issue · 3 comments
KomputerMaster64 commented
I have tried sampling/evaluating/testing the model on Google Colab as well as on a local GPU node, but I am facing a CUDA out-of-memory error in both cases.
Error on Google Colab:
Traceback (most recent call last):
File "test_ddgan.py", line 272, in <module>
sample_and_test(args)
File "test_ddgan.py", line 186, in sample_and_test
fake_sample = sample_from_model(pos_coeff, netG, args.num_timesteps, x_t_1,T, args)
File "test_ddgan.py", line 123, in sample_from_model
x_0 = generator(x, t_time, latent_z)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/content/drive/MyDrive/Repositories/denoising-diffusion-gan/score_sde/models/ncsnpp_generator_adagn.py", line 322, in forward
h = modules[m_idx](hs[-1], temb, zemb)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/content/drive/MyDrive/Repositories/denoising-diffusion-gan/score_sde/models/layerspp.py", line 300, in forward
h = self.act(self.GroupNorm_1(h, zemb))
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/content/drive/MyDrive/Repositories/denoising-diffusion-gan/score_sde/models/layerspp.py", line 61, in forward
out = gamma * out + beta
RuntimeError: CUDA out of memory. Tried to allocate 3.12 GiB (GPU 0; 14.76 GiB total capacity; 12.95 GiB already allocated; 887.75 MiB free; 12.95 GiB reserved in total by PyTorch)
Error on GPU node:
Traceback (most recent call last):
File "test_ddgan.py", line 272, in <module>
sample_and_test(args)
File "test_ddgan.py", line 186, in sample_and_test
fake_sample = sample_from_model(pos_coeff, netG, args.num_timesteps, x_t_1,T, args)
File "test_ddgan.py", line 123, in sample_from_model
x_0 = generator(x, t_time, latent_z)
File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/ncsnpp_generator_adagn.py", line 322, in forward
h = modules[m_idx](hs[-1], temb, zemb)
File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/layerspp.py", line 279, in forward
h = self.act(self.GroupNorm_0(x, zemb))
File "/home/manisha.padala/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/manisha.padala/gan/denoising-diffusion-gan/score_sde/models/layerspp.py", line 61, in forward
out = gamma * out + beta
RuntimeError: CUDA out of memory. Tried to allocate 3.12 GiB (GPU 0; 10.76 GiB total capacity; 6.70 GiB already allocated; 3.06 GiB free; 6.70 GiB reserved in total by PyTorch)
In both cases the system could not allocate the requested 3.12 GiB.
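The failing allocation is an activation tensor whose size scales linearly with the batch size, so halving the batch halves the 3.12 GiB request. A minimal sketch of that arithmetic (the channel and spatial dimensions below are illustrative assumptions, not the model's actual shapes):

```python
def activation_gib(batch, channels, height, width, bytes_per_elem=4):
    """Approximate size in GiB of one float32 activation tensor.

    Memory grows linearly with batch size, which is why shrinking
    --batch_size is the first thing to try on an OOM during sampling.
    """
    return batch * channels * height * width * bytes_per_elem / 2**30

# Example with assumed dimensions: halving the batch halves the allocation.
full = activation_gib(200, 256, 128, 128)
half = activation_gib(100, 256, 128, 128)
```
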
VedantDere0104 commented
Set --batch_size 100 when running test_ddgan.py; a smaller batch shrinks the activation tensors that trigger the failed allocation.
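If a smaller batch still has to produce the same total number of images, one generic workaround (independent of this repo) is to sample in chunks and concatenate the results. A minimal pure-Python sketch of the idea, where sample_fn is a hypothetical stand-in for a call like sample_from_model:

```python
def sample_in_chunks(sample_fn, total, chunk_size):
    """Collect `total` samples by repeatedly calling sample_fn on
    chunks of at most `chunk_size`, keeping peak memory bounded by
    the chunk size rather than the full batch."""
    samples = []
    remaining = total
    while remaining > 0:
        n = min(chunk_size, remaining)
        samples.extend(sample_fn(n))
        remaining -= n
    return samples

# Usage with a dummy sampler that just returns n placeholder items.
out = sample_in_chunks(lambda n: [0] * n, total=250, chunk_size=100)
```

With a real model, the same loop would wrap the generator call (ideally inside torch.no_grad() so no autograd buffers are kept during sampling).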
KomputerMaster64 commented
Thank you for the tip.
I would also like to know how to generate more images (say, 100 or 200 unique ones), given that running the test file prints the same images every run for a given batch size.
VedantDere0104 commented
Remove the seed at line 131 of denoising-diffusion-gan/test_ddgan.py (commit 6818ded).
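The reason removing the seed helps: seeding the RNG with the same value before every run makes the "random" latent codes identical, so the generator maps them to the same images. A minimal sketch with Python's stdlib random module standing in for the latent sampler (sample_latents is a hypothetical name, not from the repo):

```python
import random

def sample_latents(n, seed=None):
    """Draw n pseudo-random latents. With a fixed seed, every call
    returns the identical sequence; with seed=None the RNG is seeded
    from OS entropy, so each call yields fresh latents."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

seeded_a = sample_latents(4, seed=42)
seeded_b = sample_latents(4, seed=42)   # identical latents -> identical images
fresh_a = sample_latents(4)             # unseeded -> new latents each call
fresh_b = sample_latents(4)
```
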