keras-team/keras-cv

Wrong bounding boxes in the visualization of `tfds.datasets.kitti`

Issue closed · 2 comments

Current Behavior:

Before testing object detection pipelines with the KITTI dataset from tfds, I wanted to try the keras_cv visualization tools.

I used keras_cv.visualization.plot_bounding_box_gallery() and tried all bounding box formats supported by KerasCV: CENTER_XYWH, XYWH, XYXY, REL_XYXY, REL_XYWH, YXYX, and REL_YXYX.

With every format, the bounding boxes are drawn in the wrong places. The closest match should be REL_YXYX, because the dataset's bbox feature is described as:

bbox: tf.Tensor of type tf.float32 and shape [4,], which contains the normalized coordinates of the bounding box [ymin, xmin, ymax, xmax]
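To make the REL_YXYX reading concrete, here is a minimal NumPy sketch of how normalized [ymin, xmin, ymax, xmax] values map to pixel XYXY corners. It assumes the usual image convention with the origin at the top-left corner and y growing downward. The helper name `rel_yxyx_to_xyxy` is made up for this illustration and is not part of KerasCV.

```python
import numpy as np


def rel_yxyx_to_xyxy(boxes, image_height, image_width):
    """Map normalized [ymin, xmin, ymax, xmax] boxes to pixel [xmin, ymin, xmax, ymax].

    Assumption: y is measured from the TOP edge of the image. If a dataset
    measures y from the bottom edge instead, boxes drawn from this output
    end up mirrored vertically.
    """
    boxes = np.asarray(boxes, dtype=np.float32)
    ymin, xmin, ymax, xmax = np.split(boxes, 4, axis=-1)
    return np.concatenate(
        [
            xmin * image_width,
            ymin * image_height,
            xmax * image_width,
            ymax * image_height,
        ],
        axis=-1,
    )


# One box on a 100 x 200 (height x width) image.
pixel_boxes = rel_yxyx_to_xyxy([[0.1, 0.2, 0.5, 0.6]], image_height=100, image_width=200)
```

If the stored y values are instead measured from the bottom edge, this interpretation places each box at the mirrored height, which would be consistent with the misplaced boxes in the screenshot.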

test_image_incorrect

Expected Behavior:

The bounding boxes are correctly calculated, e.g. using this custom conversion:

from keras_cv.src.backend import ops
from keras_cv.src.bounding_box.converters import _image_shape, ALL_AXES

def convert_kitty_yxyx(boxes, images=None, image_shape=None):
    image_height, image_width = _image_shape([images], image_shape, boxes)
    ymin, xmin, ymax, xmax = ops.split(boxes, ALL_AXES, axis=-1)

    xmin, xmax = xmin * image_width, xmax * image_width
    ymin, ymax = (1-ymin) * image_height, (1-ymax) * image_height

    # After the flip above, `ymax` holds the smaller y value and `ymin` the
    # larger one, so they are swapped here to produce a valid XYXY box.
    # Otherwise the area calculations in
    # keras_cv.src.bounding_box.utils.clip_to_image() break if augmentation is used.
    return ops.concatenate([xmin, ymax, xmax, ymin], axis=-1)

test_image_correct

Steps To Reproduce:

Google Colab

Hi @CatUnderTheLeaf

Thanks for reporting the issue. I have tested the code snippet, and it reproduces the reported behaviour. A gist is attached for reference.

We will look into the issue and update you.

@CatUnderTheLeaf it looks like the bbox format is none of the standard ones specified in KerasCV. From what I see in your custom implementation, the boxes are in rel_yxyx format, except that the y values need to be subtracted from 1.
Here is a quick test. The bounding boxes need to be preprocessed into the correct rel_yxyx format (y values subtracted from 1) before being passed to the visualization function.
image
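The preprocessing described above could look roughly like the following NumPy sketch. It assumes the tfds KITTI boxes are normalized [ymin, xmin, ymax, xmax] with y measured from the bottom of the image, which is what the custom conversion in the report implies. The function name `kitti_tfds_to_rel_yxyx` is hypothetical and not part of KerasCV or tfds.

```python
import numpy as np


def kitti_tfds_to_rel_yxyx(boxes):
    """Flip the y axis of normalized [ymin, xmin, ymax, xmax] boxes.

    Assumption: the input y values are measured from the bottom edge.
    Subtracting from 1 reverses their order, so the old ymax becomes the
    new ymin and vice versa. The x values are left unchanged.
    """
    boxes = np.asarray(boxes, dtype=np.float32)
    ymin, xmin, ymax, xmax = np.split(boxes, 4, axis=-1)
    return np.concatenate([1.0 - ymax, xmin, 1.0 - ymin, xmax], axis=-1)


# A box spanning y in [0.2, 0.6] from the bottom spans [0.4, 0.8] from the top.
flipped = kitti_tfds_to_rel_yxyx([[0.2, 0.1, 0.6, 0.5]])
```

The result should be usable with bounding_box_format="rel_yxyx" in keras_cv.visualization.plot_bounding_box_gallery(). Because the y values are swapped as well as flipped, ymin stays smaller than ymax, which avoids the ordering workaround needed in the custom converter.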