Sunarker/Collaborative-Learning-for-Weakly-Supervised-Object-Detection

RuntimeError: CUDA error: out of memory

Closed · 1 comment

Jngwl commented

Hello, thanks for sharing your great work!
When I trained the model with ./experiments/scripts/train.sh 0 pascal_voc vgg16 voc12_wsddn_pre, I encountered the following error:

Traceback (most recent call last):
  File "./tools/trainval_net.py", line 149, in <module>
    max_iters=args.max_iters)
  File "/home/amax/GWL/pytorch-faster-rcnn/tools/../lib/model/train_val.py", line 365, in train_net
    sw.train_model(max_iters)
  File "/home/amax/GWL/pytorch-faster-rcnn/tools/../lib/model/train_val.py", line 283, in train_model
    self.net.train_step(blobs, self.optimizer)
  File "/home/amax/GWL/pytorch-faster-rcnn/tools/../lib/nets/network.py", line 574, in train_step
    self._losses['total_loss'].backward()
  File "/home/amax/.local/lib/python2.7/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/amax/.local/lib/python2.7/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=False)  # allow_unreachable flag
RuntimeError: CUDA error: out of memory
Command exited with non-zero status 1
13.82user 11.67system 0:13.76elapsed 185%CPU (0avgtext+0avgdata 3269864maxresident)k
464inputs+24outputs (0major+1764047minor)pagefaults 0swaps

My PC's details are as follows:

CPU: Intel® Xeon(R) CPU E5-1650 v4 @ 3.60GHz × 12
GPU: GeForce GTX 1080 Ti/PCIe/SSE2
Memory: 32G
Disk: 228G
Ubuntu: 16.04 LTS
CUDA: 9.0.176
cuDNN: 7.4.1
torch: 0.4.1

Can you give me some advice?
Thank you

Thank you for your interest in our work! We trained our model on a TitanX GPU, and GPU memory usage was about 8K MB (roughly 8 GB). Also, we used PyTorch 0.2, as stated in the README file.
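Since the reply points at the PyTorch version (0.2 in the README versus 0.4.1 in the report), a quick check of the installed version before training may help. The following is only a minimal sketch: the helper names parse_version, matches_release and report_environment are made up for illustration and are not part of this repository, and it assumes that comparing the leading major.minor numbers is enough.

```python
def parse_version(text):
    """Return the leading numeric components, e.g. '0.4.1' -> (0, 4, 1)."""
    parts = []
    # Drop any local build suffix such as '+cu117' before splitting.
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component, e.g. '0_3'
        parts.append(int(piece))
    return tuple(parts)


def matches_release(installed, expected):
    """True if `installed` starts with the components of `expected`."""
    want = parse_version(expected)
    return parse_version(installed)[:len(want)] == want


def report_environment(expected="0.2"):
    """Print the installed torch version and whether it matches `expected`.

    The default "0.2" is the version the reply says the README follows.
    """
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
        return None
    ok = matches_release(torch.__version__, expected)
    print("torch", torch.__version__, "matches" if ok else "differs from", expected)
    print("CUDA available:", torch.cuda.is_available())
    return ok


if __name__ == "__main__":
    report_environment()
```

If the check reports a mismatch, installing the version named in the README before rerunning train.sh would be the first thing to try. This sketch does not measure GPU memory, so it cannot confirm the 8K MB figure on its own.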