DingXiaoH/RepLKNet-pytorch

CUDA out of memory when testing on Cityscapes

awfniewf opened this issue · 0 comments

When I run the semantic segmentation test following your method, the following error is displayed:
```
(open-mmlab) liugengyuan@liugengyuan-Lenovo-Legion-R9000P2021H:~/mmsegmentation$ python -m torch.distributed.launch --nproc_per_node=1 tools/test.py configs/replknet/RepLKNet-31B_1Kpretrain_upernet_80k_cityscapes_769.py RepLKNet-31B_ImageNet-1K_UperNet_Cityscapes.pth --launcher pytorch --eval mIoU

"CLASSES" not found in meta, use dataset.CLASSES instead
"PALETTE" not found in meta, use dataset.PALETTE instead
[ ] 0/500, elapsed: 0s, ETA:/home/liugengyuan/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/utils/checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
/home/liugengyuan/mmsegmentation/mmseg/ops/wrappers.py:23: UserWarning: When align_corners=True, the output would more aligned if input size (6, 6) is x+1 and out size (33, 65) is nx+1
  f'When align_corners={align_corners}, '
Traceback (most recent call last):
  File "tools/test.py", line 320, in <module>
    main()
  File "tools/test.py", line 297, in main
    format_args=eval_kwargs)
  File "/home/liugengyuan/mmsegmentation/mmseg/apis/test.py", line 208, in multi_gpu_test
    result = model(return_loss=False, rescale=True, **data)
  File "/home/liugengyuan/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/liugengyuan/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 886, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/liugengyuan/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/liugengyuan/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 110, in new_func
    return old_func(*args, **kwargs)
  File "/home/liugengyuan/mmsegmentation/mmseg/models/segmentors/base.py", line 110, in forward
    return self.forward_test(img, img_metas, **kwargs)
  File "/home/liugengyuan/mmsegmentation/mmseg/models/segmentors/base.py", line 92, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/home/liugengyuan/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 262, in simple_test
    seg_logit = self.inference(img, img_meta, rescale)
  File "/home/liugengyuan/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 247, in inference
    seg_logit = self.whole_inference(img, img_meta, rescale)
  File "/home/liugengyuan/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 206, in whole_inference
    seg_logit = self.encode_decode(img, img_meta)
  File "/home/liugengyuan/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 74, in encode_decode
    out = self._decode_head_forward_test(x, img_metas)
  File "/home/liugengyuan/mmsegmentation/mmseg/models/segmentors/encoder_decoder.py", line 96, in _decode_head_forward_test
    seg_logits = self.decode_head.forward_test(x, img_metas, self.test_cfg)
  File "/home/liugengyuan/mmsegmentation/mmseg/models/decode_heads/decode_head.py", line 222, in forward_test
    return self.forward(inputs)
  File "/home/liugengyuan/mmsegmentation/mmseg/models/decode_heads/uper_head.py", line 138, in forward
    output = self._forward_feature(inputs)
  File "/home/liugengyuan/mmsegmentation/mmseg/models/decode_heads/uper_head.py", line 132, in _forward_feature
    fpn_outs = torch.cat(fpn_outs, dim=1)
RuntimeError: CUDA out of memory. Tried to allocate 1.01 GiB (GPU 0; 5.78 GiB total capacity; 2.33 GiB already allocated; 578.75 MiB free; 3.34 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
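
The error message itself suggests setting max_split_size_mb via PYTORCH_CUDA_ALLOC_CONF. If I understand correctly, that would mean launching the test like this (the value 128 is just a guess on my part, not something taken from the documentation):

```sh
# Tune PyTorch's caching allocator to reduce fragmentation before launching.
# max_split_size_mb:128 is an example value; it may need tuning per GPU.
PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 \
python -m torch.distributed.launch --nproc_per_node=1 tools/test.py \
    configs/replknet/RepLKNet-31B_1Kpretrain_upernet_80k_cityscapes_769.py \
    RepLKNet-31B_ImageNet-1K_UperNet_Cityscapes.pth \
    --launcher pytorch --eval mIoU
```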

I have only just started with deep learning, so there is a lot I don't understand yet. I am using a single GPU; is the error caused by not having enough video memory? How can I solve this problem? Thank you!
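
From the traceback, the model seems to run whole-image inference (whole_inference in encoder_decoder.py), so the full Cityscapes image goes through the network at once. As far as I can tell from the mmsegmentation docs, switching test_cfg to sliding-window mode should cap the peak memory per forward pass; would a config change like this be the right approach? (The crop and stride values below are my assumption, based on the 769 in the config name.)

```python
# Sketch of a config override, assuming mmsegmentation's EncoderDecoder test_cfg:
# sliding-window inference processes fixed-size crops instead of the full image,
# so peak activation memory depends on crop_size rather than image size.
model = dict(
    test_cfg=dict(mode='slide', crop_size=(769, 769), stride=(513, 513)))
```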