facebookresearch/Mask2Former

No matter how many instances there are, the size of the model prediction output is always 100.

SEUZTh opened this issue · 7 comments

def run_on_image(self, image):
  ...
  predictions = self.predictor(image)
  ...

You can use confidence score to filter predictions.

Thanks for your reply.
Detectron2 has the same problem, but it has a parameter to control it:

TEST:
  DETECTIONS_PER_IMAGE: 300

But I can't find the same parameter in Mask2Former.

Hello @SEUZTh. I'm using the video demo, with a bit of hacking on the Mask2Former demo and Detectron2 to save binary masks.
The video demo is made for VIS, so there might be small differences. I would suggest something like this:

        # Assumes `import numpy as np` at the top of the file.
        predictions = self.predictor(frames)
        # Convert the boolean mask into integer indices; iterating over the
        # boolean array directly would index with True/False (i.e. 1/0).
        thresholded_idxs = np.flatnonzero(np.array(predictions["pred_scores"]) >= confidence_threshold)

        image_size = predictions["image_size"]
        pred_scores = [predictions["pred_scores"][i] for i in thresholded_idxs]
        pred_labels = [predictions["pred_labels"][i] for i in thresholded_idxs]
        pred_masks = [predictions["pred_masks"][i] for i in thresholded_idxs]

The model output size depends on the overall architecture and on how the model was trained: we always use 100 queries, and some of them are simply assigned to no class. Using similar code, you can filter out predictions with low scores.
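
To illustrate why the number of predictions does not follow the number of objects, here is a toy, pure-Python sketch. It assumes a post-processing step that keeps the k best (query, class) pairs and treats the last logit of each query as "no object". The helper names (`select_top_k`, `filter_by_score`) and the toy logits are hypothetical, and this is not the repository's implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_top_k(class_logits, k):
    """Return the k best (score, query, label) triples.

    class_logits holds one list of logits per query; the LAST entry of each
    list is assumed to be the "no object" class and is never returned.
    """
    candidates = []
    for query, logits in enumerate(class_logits):
        for label, score in enumerate(softmax(logits)[:-1]):
            candidates.append((score, query, label))
    candidates.sort(key=lambda item: item[0], reverse=True)
    return candidates[:k]


def filter_by_score(candidates, threshold):
    """Keep only the triples whose score reaches the threshold."""
    return [c for c in candidates if c[0] >= threshold]


# Three queries, two real classes plus "no object"; only query 0 is confident.
toy_logits = [[5.0, 0.0, 0.0], [0.0, 0.0, 5.0], [0.0, 0.0, 5.0]]
top = select_top_k(toy_logits, k=3)
kept = filter_by_score(top, threshold=0.5)
print(len(top), len(kept))  # 3 1
```

The selection always returns k entries as long as enough candidates exist, even when most of them have near-zero scores, so a score threshold is what separates real detections from filler.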

Thanks for your reply. I need to predict more than 100 instances in one image, so I need a larger prediction output.

Hi
I have the same issue. Any updates?

Thanks in advance

@AhmadZobairSurosh @SEUZTh On my custom dataset, training works fine, but during test/evaluation I get this error: DefaultCPUAllocator: can't allocate memory: you tried to allocate

Is it because I'm using a single GPU? I tried reducing the batch size (IMS_PER_BATCH), but it didn't help. Please look into it.

weight_dict: {'loss_ce': 2.0, 'loss_mask': 5.0, 'loss_dice': 5.0, 'loss_ce_0': 2.0, 'loss_mask_0': 5.0, 'loss_dice_0': 5.0, 'loss_ce_1': 2.0, 'loss_mask_1': 5.0, 'loss_dice_1': 5.0, 'loss_ce_2': 2.0, 'loss_mask_2': 5.0, 'loss_dice_2': 5.0, 'loss_ce_3': 2.0, 'loss_mask_3': 5.0, 'loss_dice_3': 5.0, 'loss_ce_4': 2.0, 'loss_mask_4': 5.0, 'loss_dice_4': 5.0, 'loss_ce_5': 2.0, 'loss_mask_5': 5.0, 'loss_dice_5': 5.0, 'loss_ce_6': 2.0, 'loss_mask_6': 5.0, 'loss_dice_6': 5.0, 'loss_ce_7': 2.0, 'loss_mask_7': 5.0, 'loss_dice_7': 5.0, 'loss_ce_8': 2.0, 'loss_mask_8': 5.0, 'loss_dice_8': 5.0}
num_classes: 80
eos_coef: 0.1
num_points: 12544
oversample_ratio: 3.0
importance_sample_ratio: 0.75
)> to CPU due to CUDA OOM
ERROR [03/22 09:07:53 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "/home/ec2-user/detectron2/detectron2/engine/train_loop.py", line 156, in train
    self.after_step()
  File "/home/ec2-user/detectron2/detectron2/engine/train_loop.py", line 190, in after_step
    h.after_step()
  File "/home/ec2-user/detectron2/detectron2/engine/hooks.py", line 556, in after_step
    self._do_eval()
  File "/home/ec2-user/detectron2/detectron2/engine/hooks.py", line 529, in _do_eval
    results = self._func()
  File "/home/ec2-user/detectron2/detectron2/engine/defaults.py", line 453, in test_and_save_results
    self._last_eval_results = self.test(self.cfg, self.model)
  File "/home/ec2-user/detectron2/detectron2/engine/defaults.py", line 617, in test
    results_i = inference_on_dataset(model, data_loader, evaluator)
  File "/home/ec2-user/detectron2/detectron2/evaluation/evaluator.py", line 158, in inference_on_dataset
    outputs = model(inputs)
  File "/home/ec2-user/anaconda3/envs/mask2former/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ec2-user/Mask2Former/mask2former/maskformer_model.py", line 259, in forward
    instance_r = retry_if_cuda_oom(self.instance_inference)(mask_cls_result, mask_pred_result)
  File "/home/ec2-user/detectron2/detectron2/utils/memory.py", line 82, in wrapped
    return func(*new_args, **new_kwargs)
  File "/home/ec2-user/Mask2Former/mask2former/maskformer_model.py", line 377, in instance_inference
    mask_scores_per_image = (mask_pred.sigmoid().flatten(1) * result.pred_masks.flatten(1)).sum(1) / (result.pred_masks.flatten(1).sum(1) + 1e-6)
RuntimeError: [enforce fail at CPUAllocator.cpp:71] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 3077222400 bytes. Error code 12 (Cannot allocate memory)
[03/22 09:07:53 d2.engine.hooks]: Overall training speed: 47 iterations in 0:00:25 (0.5436 s / it)
[03/22 09:07:53 d2.engine.hooks]: Total training time: 0:00:50 (0:00:24 on hooks)
[03/22 09:07:53 d2.utils.events]: eta: 0:07:39 iter: 49 total_loss: 66.03 loss_ce: 1.179 loss_mask: 0.7194 loss_dice: 4.564 loss_ce_0: 3.898 loss_mask_0: 0.6054 loss_dice_0: 4.574 loss_ce_1: 0.02219 loss_mask_1: 0.7182 loss_dice_1: 4.581 loss_ce_2: 1.177 loss_mask_2: 0.7029 loss_dice_2: 4.579 loss_ce_3: 1.179 loss_mask_3: 0.7023 loss_dice_3: 4.568 loss_ce_4: 1.175 loss_mask_4: 0.6957 loss_dice_4: 4.57 loss_ce_5: 1.178 loss_mask_5: 0.6936 loss_dice_5: 4.58 loss_ce_6: 1.169 loss_mask_6: 0.7088 loss_dice_6: 4.57 loss_ce_7: 1.172 loss_mask_7: 0.7256 loss_dice_7: 4.559 loss_ce_8: 1.171 loss_mask_8: 0.722 loss_dice_8: 4.565 time: 0.5322 last_time: 0.5565 data_time: 0.0033 last_data_time: 0.0031 lr: 0.0025 max_mem: 11326M
Traceback (most recent call last):
  File "train.py", line 422, in <module>
    launch(
  File "/home/ec2-user/detectron2/detectron2/engine/launch.py", line 84, in launch
    main_func(*args)
  File "train.py", line 416, in main
    return trainer.train()
  File "/home/ec2-user/detectron2/detectron2/engine/defaults.py", line 484, in train
    super().train(self.start_iter, self.max_iter)
  File "/home/ec2-user/detectron2/detectron2/engine/train_loop.py", line 156, in train
    self.after_step()
  File "/home/ec2-user/detectron2/detectron2/engine/train_loop.py", line 190, in after_step
    h.after_step()
  File "/home/ec2-user/detectron2/detectron2/engine/hooks.py", line 556, in after_step
    self._do_eval()
  File "/home/ec2-user/detectron2/detectron2/engine/hooks.py", line 529, in _do_eval
    results = self._func()
  File "/home/ec2-user/detectron2/detectron2/engine/defaults.py", line 453, in test_and_save_results
    self._last_eval_results = self.test(self.cfg, self.model)
  File "/home/ec2-user/detectron2/detectron2/engine/defaults.py", line 617, in test
    results_i = inference_on_dataset(model, data_loader, evaluator)
  File "/home/ec2-user/detectron2/detectron2/evaluation/evaluator.py", line 158, in inference_on_dataset
    outputs = model(inputs)
  File "/home/ec2-user/anaconda3/envs/mask2former/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ec2-user/Mask2Former/mask2former/maskformer_model.py", line 259, in forward
    instance_r = retry_if_cuda_oom(self.instance_inference)(mask_cls_result, mask_pred_result)
  File "/home/ec2-user/detectron2/detectron2/utils/memory.py", line 82, in wrapped
    return func(*new_args, **new_kwargs)
  File "/home/ec2-user/Mask2Former/mask2former/maskformer_model.py", line 377, in instance_inference
    mask_scores_per_image = (mask_pred.sigmoid().flatten(1) * result.pred_masks.flatten(1)).sum(1) / (result.pred_masks.flatten(1).sum(1) + 1e-6)
RuntimeError: [enforce fail at CPUAllocator.cpp:71] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 3077222400 bytes. Error code 12 (Cannot allocate memory)
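
The requested size in the error can be sanity-checked with a little arithmetic. This is a rough sketch, assuming the failing tensor holds one float32 value per pixel for each of 100 masks at full image resolution; the log does not confirm that. The helper `mask_tensor_bytes` is hypothetical, and the 2544 x 3024 resolution is only one factorization that happens to match the number, not a value taken from the log.

```python
def mask_tensor_bytes(num_masks, height, width, bytes_per_element=4):
    """Bytes needed for a dense (num_masks, height, width) tensor."""
    return num_masks * height * width * bytes_per_element


requested = 3077222400  # from the RuntimeError message

# Assumption: 100 masks stored as float32 (4 bytes per element).
pixels_per_mask = requested // (100 * 4)
print(pixels_per_mask)  # 7693056

# One resolution consistent with that pixel count (hypothetical).
print(mask_tensor_bytes(100, 2544, 3024) == requested)  # True
```

If this reading is right, the allocation grows linearly with both the number of masks kept and the image area, so evaluating on smaller images would shrink it, and keeping more predictions per image would enlarge it.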