rwightman/efficientdet-pytorch

[BUG] BBoxes not clipped or removed in RandomResizePad and ResizePad

mkmenta opened this issue · 4 comments

First of all, thanks for your hard work and this great project!

Describe the bug
I think the bounding boxes are not clipped or removed correctly when their coordinates extend beyond the right and bottom edges of the image in the RandomResizePad and ResizePad data transforms.

To Reproduce

import random

import numpy as np
from PIL import Image

from effdet.data import RandomResizePad

img = Image.new("RGB", (1000, 1500), color=(0, 0, 0))  # blank 1000x1500 test image
target = {
    # two boxes that end up partly/fully outside the 512x512 output
    # after the transform below
    "bbox": np.array([[649, 349, 1400, 703],
                      [1400, 270, 1480, 434]], dtype=np.float64),
    "cls": np.array([1, 2])
}

random.seed(0)  # make the random scale/crop reproducible
# degenerate scale range pins the scale factor to 1.8
rrp = RandomResizePad(target_size=512, scale=(1.8, 1.8))
img_t, target_t = rrp(img, target)

print(target_t['bbox'])

The code above outputs:

[[ 88.7456 172.4256 550.16   389.9232]
 [550.16   123.888  599.312  224.6496]]

which, unless I'm mistaken, is incorrect: we set target_size=512, yet the first bbox's y2 coordinate is 550.16. The same happens with the second bbox, whose coordinates lie entirely outside the 512x512 image.

Expected behavior
I think the output of that code should be:

[[ 88.7456 172.4256 512.     389.9232]]

(first bbox clipped and second bbox removed).
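For reference, here is a minimal sketch of the clip-then-drop behavior I would expect. clip_and_filter is a hypothetical helper, not part of effdet's API; it assumes effdet's (y1, x1, y2, x2) box layout and a (height, width) size tuple:

import numpy as np

def clip_and_filter(bbox, cls, size):
    """Clamp boxes to [0, size] and drop boxes left with no area."""
    h, w = size
    bbox = bbox.copy()
    bbox[:, 0::2] = bbox[:, 0::2].clip(0, h)  # y1, y2
    bbox[:, 1::2] = bbox[:, 1::2].clip(0, w)  # x1, x2
    # keep only boxes that still have positive height and width
    keep = (bbox[:, 2] > bbox[:, 0]) & (bbox[:, 3] > bbox[:, 1])
    return bbox[keep], cls[keep]

bbox = np.array([[ 88.7456, 172.4256, 550.16,  389.9232],
                 [550.16,   123.888,  599.312, 224.6496]])
cls = np.array([1, 2])
clipped, kept_cls = clip_and_filter(bbox, cls, (512, 512))
print(clipped)   # [[ 88.7456 172.4256 512.     389.9232]]
print(kept_cls)  # [1]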

Screenshots
A visualization of what I'm describing: before vs. after the transform (screenshots).


Suggested fix
Changing lines 100 and 162 of transforms.py from:

clip_boxes_(bbox, (scaled_h, scaled_w))

to

clip_boxes_(bbox, self.target_size)

With that change, the code I wrote to reproduce the issue outputs:

[[ 88.7456 172.4256 512.     389.9232]]

Visualization of the transformed image after the fix (screenshot).

Thank you in advance!

@mkmenta thanks, looks like a potential issue, I'll dig in more over the next few days

Thinking it should be as below, to ensure clipping to either the target image bounds or the letterboxed region, whichever is smaller.

clip_boxes_(bbox, (min(scaled_h, self.target_size[0]), min(scaled_w, self.target_size[1])))
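A small numeric sketch of why taking the min matters (the sizes below are assumed values for illustration): when the scaled image is smaller than the target it gets letterboxed and boxes must stay within the scaled extent; when it is larger, the image is cropped and boxes must stay within the target bounds.

def clip_bound(scaled_h, scaled_w, target_size):
    # clip to whichever is smaller per dimension: the scaled image
    # extent (letterbox case) or the target bounds (crop case)
    return (min(scaled_h, target_size[0]), min(scaled_w, target_size[1]))

print(clip_bound(400, 650, (512, 512)))  # (400, 512): letterboxed in h, cropped in w
print(clip_bound(922, 614, (512, 512)))  # (512, 512): cropped in both dimensions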

That's true! Sorry, I missed that.

@mkmenta I'm testing #186 in training