"out of memory" when in eval() mode
ahmed-shariff opened this issue · 1 comment
ahmed-shariff commented
I am testing the model on some of my own images by importing it as a module. When I set the model to eval mode, I get the following error:
THCudaCheck FAIL file=/home/amsha/builds/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
File "pipeline.py", line 601, in <module>
main(parser.parse_args())
File "pipeline.py", line 594, in main
_main()
File "pipeline.py", line 141, in _main
train_output = current_model.train_model(dataloader.get_train_input_fn(), classification_steps)
File "/home/amsha/Research/FoodClassification/models/models_fasterrcnn.py", line 288, in train_model
rois_label = self.model(input_var, iinfo_var, gtbox_var, nmbox_var)
File "/home/amsha/virtualenv/torch-master-13022018/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
result = self.forward(*input, **kwargs)
File "/home/amsha/Research/faster-rcnn.pytorch/lib/model/faster_rcnn/faster_rcnn.py", line 77, in forward
pooled_feat = self.RCNN_roi_crop(base_feat, Variable(grid_yx).detach())
File "/home/amsha/virtualenv/torch-master-13022018/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in __call__
result = self.forward(*input, **kwargs)
File "/home/amsha/Research/faster-rcnn.pytorch/lib/model/roi_crop/modules/roi_crop.py", line 8, in forward
return RoICropFunction()(input1, input2)
File "/home/amsha/Research/faster-rcnn.pytorch/lib/model/roi_crop/functions/roi_crop.py", line 11, in forward
output = input2.new(input2.size()[0], input1.size()[1], input2.size()[1], input2.size()[2]).zero_()
RuntimeError: cuda runtime error (2) : out of memory at /home/amsha/builds/pytorch/aten/src/THC/generic/THCStorage.cu:58
The code block where this originates:
# imports used below: os, torch, torchvision, and skimage.io as io
self.model = resnet([0, 1], 50)
self.model.create_architecture()
...
# When I use model.train(False) or model.eval(), I get the above error.
# The problem doesn't happen when the line below is model.train()
self.model.train(False)
for root_dir, dirs, files in os.walk("../Datasets/captured/processed_v2/"):
    for f in files:
        img = io.imread(os.path.join(root_dir, f))
        # [image tensor, unused slot, num_boxes, image info (H, W), dummy gt box]
        i = [torchvision.transforms.ToTensor()(img).unsqueeze(0),
             None,
             torch.Tensor(0),
             torch.Tensor(img.shape[:2]).unsqueeze(0),
             torch.Tensor([[[1, 2, 3, 4, 1]]])]
        if self.use_cuda:
            i = [i_.cuda() if i_ is not None else i_ for i_ in i]
        input_var = torch.autograd.Variable(i[0].float())
        nmbox_var = torch.autograd.Variable(i[2])
        iinfo_var = torch.autograd.Variable(i[3])
        gtbox_var = torch.autograd.Variable(i[4])
        rois, cls_prob, bbox_pred, \
            rpn_loss_cls, rpn_loss_bbox, \
            RCNN_loss_cls, RCNN_loss_bbox, \
            rois_label = self.model(input_var, iinfo_var, gtbox_var, nmbox_var)
        print(rpn_loss_cls, rpn_loss_bbox,
              RCNN_loss_cls, RCNN_loss_bbox)
        continue
I am using resnet50 as the backbone.
pytorch version: '0.4.0a0+b608ea9'
faster-rcnn.pytorch from: 28ee76d6ae868ca43c4e38bedbafd82d919f601a
GPU: GTX 1050
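For reference, here is a minimal sketch of how I could run the same forward pass without autograd bookkeeping, in case the memory is going to gradient state rather than the ROI crop allocation itself. It reuses the i list and self.model from the snippet above, and it assumes torch.no_grad() is available on this 0.4.0 pre-release build; I have not confirmed that it avoids the error.

# Sketch only: same eval-mode call, but inside torch.no_grad() so no
# autograd graph is built for the forward pass (assumes a 0.4-style API).
self.model.eval()
with torch.no_grad():
    input_var = torch.autograd.Variable(i[0].float())
    nmbox_var = torch.autograd.Variable(i[2])
    iinfo_var = torch.autograd.Variable(i[3])
    gtbox_var = torch.autograd.Variable(i[4])
    rois, cls_prob, bbox_pred, \
        rpn_loss_cls, rpn_loss_bbox, \
        RCNN_loss_cls, RCNN_loss_bbox, \
        rois_label = self.model(input_var, iinfo_var, gtbox_var, nmbox_var)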
ahmed-shariff commented
This is embarrassing! This is the wrong repo. So sorry mate!