opendatalab/PDF-Extract-Kit

ERROR: Default GPU_MEM_LIMIT in mask_ops.py is too small; try increasing it

drunkpig opened this issue · 1 comment

2024-07-29 11:10:25.818 | INFO     | __main__:<module>:292 - => processing s3 pdf: s3://files/8589939000-8589939999/[图解奇门遁甲大全(第1部):吉凶占断].唐颐.扫描版.pdf, total pages: 599
2024-07-29 11:10:31.261 | ERROR    | __main__:<module>:420 - Default GPU_MEM_LIMIT in mask_ops.py is too small; try increasing it
Traceback (most recent call last):

> File "/root/project/doc-pipeline/main.py", line 309, in <module>
    layout_res = layout_model(image, ignore_catids=[])
                 │            └ array([[[254, 254, 254],
                 │                      [254, 254, 254],
                 │                      [254, 254, 254],
                 │                      ...,
                 │                      [254, 254, 254],
                 │                      [254...
                 └ <modules.layoutlmv3.model_init.Layoutlmv3_Predictor object at 0x7fe0d1c8ffa0>

  File "/root/project/doc-pipeline/modules/layoutlmv3/model_init.py", line 124, in __call__
    outputs = self.predictor(image)
              │    │         └ array([[[254, 254, 254],
              │    │                   [254, 254, 254],
              │    │                   [254, 254, 254],
              │    │                   ...,
              │    │                   [254, 254, 254],
              │    │                   [254...
              │    └ <detectron2.engine.defaults.DefaultPredictor object at 0x7fe0cdeaf670>
              └ <modules.layoutlmv3.model_init.Layoutlmv3_Predictor object at 0x7fe0d1c8ffa0>

  File "/opt/conda/envs/pdf/lib/python3.10/site-packages/detectron2/engine/defaults.py", line 317, in __call__
    predictions = self.model([inputs])[0]
                  │    │      └ {'image': tensor([[[254., 254., 254.,  ..., 254., 254., 254.],
                  │    │                 [254., 254., 254.,  ..., 254., 254., 254.],
                  │    │                 ...
                  │    └ VLGeneralizedRCNN(
                  │        (backbone): FPN(
                  │          (fpn_lateral2): Conv2d(768, 256, kernel_size=(1, 1), stride=(1, 1))
                  │          (fpn_output...
                  └ <detectron2.engine.defaults.DefaultPredictor object at 0x7fe0cdeaf670>
  File "/opt/conda/envs/pdf/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           │    │           │       └ {}
           │    │           └ ([{'image': tensor([[[254., 254., 254.,  ..., 254., 254., 254.],
           │    │                      [254., 254., 254.,  ..., 254., 254., 254.],
           │    │                    ...
           │    └ <function Module._call_impl at 0x7fe1bbf66dd0>
           └ VLGeneralizedRCNN(
               (backbone): FPN(
                 (fpn_lateral2): Conv2d(768, 256, kernel_size=(1, 1), stride=(1, 1))
                 (fpn_output...
  File "/opt/conda/envs/pdf/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           │             │       └ {}
           │             └ ([{'image': tensor([[[254., 254., 254.,  ..., 254., 254., 254.],
           │                        [254., 254., 254.,  ..., 254., 254., 254.],
           │                      ...
           └ <bound method VLGeneralizedRCNN.forward of VLGeneralizedRCNN(
               (backbone): FPN(
                 (fpn_lateral2): Conv2d(768, 256, kernel_...

  File "/root/project/doc-pipeline/modules/layoutlmv3/rcnn_vl.py", line 55, in forward
    return self.inference(batched_inputs)
           │    │         └ [{'image': tensor([[[254., 254., 254.,  ..., 254., 254., 254.],
           │    │                    [254., 254., 254.,  ..., 254., 254., 254.],
           │    │                   ...
           │    └ <function VLGeneralizedRCNN.inference at 0x7fe0d218de10>
           └ VLGeneralizedRCNN(
               (backbone): FPN(
                 (fpn_lateral2): Conv2d(768, 256, kernel_size=(1, 1), stride=(1, 1))
                 (fpn_output...

  File "/root/project/doc-pipeline/modules/layoutlmv3/rcnn_vl.py", line 129, in inference
    return GeneralizedRCNN._postprocess(results, batched_inputs, images.image_sizes)
           │               │            │        │               │      └ [(1157, 800)]
           │               │            │        │               └ <detectron2.structures.image_list.ImageList object at 0x7fe0cdcaada0>
           │               │            │        └ [{'image': tensor([[[254., 254., 254.,  ..., 254., 254., 254.],
           │               │            │                   [254., 254., 254.,  ..., 254., 254., 254.],
           │               │            │                  ...
           │               │            └ [Instances(num_instances=12, image_height=1157, image_width=800, fields=[pred_boxes: Boxes(tensor([[ 4759.1147, 14032.8750, 1...
           │               └ <staticmethod(<function GeneralizedRCNN._postprocess at 0x7fe0d1fff250>)>
           └ <class 'detectron2.modeling.meta_arch.rcnn.GeneralizedRCNN'>

  File "/opt/conda/envs/pdf/lib/python3.10/site-packages/detectron2/modeling/meta_arch/rcnn.py", line 241, in _postprocess
    r = detector_postprocess(results_per_image, height, width)
        │                    │                  │       └ 15065
        │                    │                  └ 21790
        │                    └ Instances(num_instances=12, image_height=1157, image_width=800, fields=[pred_boxes: Boxes(tensor([[ 4759.1147, 14032.8750, 10...
        └ <function detector_postprocess at 0x7fe0d21b5630>
  File "/opt/conda/envs/pdf/lib/python3.10/site-packages/detectron2/modeling/postprocessing.py", line 67, in detector_postprocess
    results.pred_masks = roi_masks.to_bitmasks(
    │                    │         └ <function ROIMasks.to_bitmasks at 0x7fe0d2767d90>
    │                    └ ROIMasks(num_instances=12)
    └ Instances(num_instances=12, image_height=21790, image_width=15065, fields=[pred_boxes: Boxes(tensor([[ 4759.1147, 14032.8750,...
  File "/opt/conda/envs/pdf/lib/python3.10/site-packages/detectron2/structures/masks.py", line 526, in to_bitmasks
    bitmasks = paste(
               └ <function paste_masks_in_image at 0x7fe0710a69e0>
  File "/opt/conda/envs/pdf/lib/python3.10/site-packages/detectron2/utils/memory.py", line 70, in wrapped
    return func(*args, **kwargs)
           │     │       └ {'threshold': 0.5}
           │     └ (tensor([[[0.9995, 0.9997, 0.9999,  ..., 0.9999, 0.9999, 0.9996],
           │                [1.0000, 1.0000, 1.0000,  ..., 1.0000, 1.0000, 0.9...
           └ <function paste_masks_in_image at 0x7fe0d2764310>
  File "/opt/conda/envs/pdf/lib/python3.10/site-packages/detectron2/layers/mask_ops.py", line 125, in paste_masks_in_image
    num_chunks <= N
    │             └ 12
    └ 15

AssertionError: Default GPU_MEM_LIMIT in mask_ops.py is too small; try increasing it

There is a hard-coded GPU memory limit in detectron2 (GPU_MEM_LIMIT in mask_ops.py). Because the scanned page is post-processed at its original resolution (21790 × 15065), pasting the 12 predicted masks would require 15 chunks, which exceeds the default 1 GB budget and trips the `num_chunks <= N` assertion. Try enlarging the limit, e.g. as sketched below:
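A minimal sketch of one way to do this without forking detectron2: raise the module-level constant before running layout inference. `paste_masks_in_image` reads `GPU_MEM_LIMIT` as a global at call time, so monkey-patching it takes effect on the next prediction. The 4 GB value below is only an assumption; choose whatever your GPU can spare.

```python
import detectron2.layers.mask_ops as mask_ops

# Default in detectron2/layers/mask_ops.py is 1024 ** 3 (1 GB).
# Raising it lets paste_masks_in_image use fewer, larger chunks,
# so num_chunks stays <= the number of instances. 4 GB is an assumed value.
mask_ops.GPU_MEM_LIMIT = 4 * 1024 ** 3

# ... then build the predictor and run layout_model(image, ...) as before.
```

Editing the constant directly in the installed mask_ops.py works too, but the patch above survives package reinstalls and keeps the change local to this pipeline.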