CUDA out of memory

Question

Hi there, well done with this excellent work.

I was trying to run the demo of the model and I encountered this problem:

Successfully load checkpoint from experiment/simple3dmesh_infer/baseline_mix/final.pth.tar.
===> Start inferencing...
0%| | 0/8 [00:03<?, ?it/s]
Traceback (most recent call last):
File "main/inference.py", line 355, in <module>
main(args)
File "main/inference.py", line 348, in main
inferencer.infer(epoch=0)
File "main/inference.py", line 124, in infer
_, _, _, _, pred_mesh, _, pred_root_xy_img = self.model(imgs, inv_trans, intrinsic_param, pose_root, depth_factor, flip_item=None, flip_mask=None)
File "/home/innova/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/innova/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 159, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/innova/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/innova/environment/VirtualMarker/virtualmarker/models/simple3dmesh.py", line 62, in forward
pred_xyz_jts, confidence, pred_uvd_jts_flat, pred_root_xy_img = self.simple3dpose(x, trans_inv, intrinsic_param, joint_root, depth_factor, flip_item, flip_output, flip_mask) # (B, J+K, 3), (B, J+K)
File "/home/innova/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/innova/environment/VirtualMarker/virtualmarker/models/simple3dpose.py", line 202, in forward
out = norm_heatmap(self.norm_type, out)
File "/home/innova/environment/VirtualMarker/virtualmarker/models/simple3dpose.py", line 20, in norm_heatmap
heatmap = F.softmax(heatmap, 2)
File "/home/innova/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/functional.py", line 1512, in softmax
ret = input.softmax(dim)
RuntimeError: CUDA out of memory. Tried to allocate 2.53 GiB (GPU 0; 7.92 GiB total capacity; 5.50 GiB already allocated; 1.40 GiB free; 5.65 GiB reserved in total by PyTorch)

Could you help me with this issue or offer any advice or recommendations?

My GPU is an NVIDIA Quadro P4000 (GP104GL) with 8 GB of memory.

Many thanks

Answer 1 · 2023-06-22T11:27:51.000Z

Edit: I reduced the batch size from the default of 32 to 16, and it is now working.

I edited line 49 of main/inference.py:
parser.add_argument('--batch_size', type=int, default=16, help='batch size for detection and motion capture')
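
For anyone who would rather not edit the file, here is a minimal sketch of overriding the value at run time instead. It assumes the script parses `--batch_size` with a standard argparse `parse_args()` call; `build_parser` and `scaled_allocation_gib` are hypothetical helpers written for illustration, not part of the repository. The estimate also assumes the failing softmax allocation grows linearly with batch size, which I have not verified against the model code.

```python
import argparse


def build_parser():
    # Hypothetical stand-in that mirrors only the --batch_size option
    # quoted above; the real script defines many more arguments.
    parser = argparse.ArgumentParser()
    parser.add_argument('--batch_size', type=int, default=32,
                        help='batch size for detection and motion capture')
    return parser


def scaled_allocation_gib(alloc_gib, old_batch, new_batch):
    # Rough estimate only: assumes the allocation scales linearly
    # with the batch dimension.
    return alloc_gib * new_batch / old_batch


# Simulate passing "--batch_size 16" on the command line.
args = build_parser().parse_args(['--batch_size', '16'])

# The traceback reports a failed 2.53 GiB allocation at batch size 32
# with 1.40 GiB free.
estimate = scaled_allocation_gib(2.53, 32, args.batch_size)
print(f"batch_size={args.batch_size}, estimated allocation: {estimate:.2f} GiB")
```

With the real script this would presumably mean adding `--batch_size 16` to the usual demo command, leaving the default in the file untouched. If 16 still runs out of memory on a smaller card, halving again is the obvious next thing to try.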