yzhou377/argoverse-kitti-adapter

Incorrect handling of truncated bounding boxes

Opened this issue · 3 comments

Hey, thanks for your tool! It's pretty useful. One thing I noticed, though, is that it only keeps a label if all of the following conditions hold:

  1. the center is in front of the camera
  2. the center is within max_d meters of the camera
  3. both corners of the 2D bounding box are within the image

I believe this is not the desired behavior for a KITTI-fied dataset, because even when the projected 2D-from-3D boxes are partially outside the frame (i.e. truncated), we still want to use the truncated versions as supervision. You can see in this video that KITTI keeps these boxes. I also believe this is standard practice in object recognition.

I think the right behavior is to keep a box if any of its corners is within the image frame. Getting the 2D bbox in that case is more complicated, though. You probably want to find where the edges of the 3D bbox (projected to 2D) intersect the image border, then build a point set from the union of (1) those intersections and (2) the vertices of the 3D bbox that remain in frame. The min_x, min_y, max_x, max_y of that point set give your 2D bbox.

if 0<center_cam_frame[0][2]<max_d and 0<image_bbox[0]<1920 and 0<image_bbox[1]<1200 and 0<image_bbox[2]<1920 and 0<image_bbox[3]<1200:
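A minimal sketch of the clipping idea described above, not code from this repository. The names `clip_segment`, `truncated_bbox`, and `BOX_EDGES` are hypothetical, the 1920x1200 image size is taken from the check above, and the corner ordering (index bits select the x/y/z side of the box) is an assumption that would need to match however the adapter orders its 8 projected corners. It also assumes every corner is already in front of the camera; corners behind the image plane would have to be clipped in 3D before projecting.

```python
from itertools import combinations

# Assumed image size, taken from the bounds used in the existing check.
IMG_W, IMG_H = 1920, 1200

# Assumed corner ordering: bit 0/1/2 of the index picks the x/y/z side of the
# box, so two corners share an edge when their indices differ in one bit.
BOX_EDGES = [(i, j) for i, j in combinations(range(8), 2)
             if bin(i ^ j).count("1") == 1]


def clip_segment(p, q, w=IMG_W, h=IMG_H):
    """Clip segment p->q to [0, w] x [0, h] (Liang-Barsky style).

    Returns the two clipped endpoints, or None if the segment misses the image.
    """
    (x0, y0), (x1, y1) = p, q
    dx, dy = x1 - x0, y1 - y0
    t0, t1 = 0.0, 1.0
    for d, hi, start in ((dx, w, x0), (dy, h, y0)):
        if d == 0:
            # Segment is parallel to this axis: reject if outside the slab.
            if start < 0 or start > hi:
                return None
            continue
        ta, tb = (0 - start) / d, (hi - start) / d
        if ta > tb:
            ta, tb = tb, ta
        t0, t1 = max(t0, ta), min(t1, tb)
        if t0 > t1:
            return None
    return (x0 + t0 * dx, y0 + t0 * dy), (x0 + t1 * dx, y0 + t1 * dy)


def truncated_bbox(corners_2d, w=IMG_W, h=IMG_H):
    """Return (min_x, min_y, max_x, max_y) of the visible part of the box.

    corners_2d: the 8 projected corners as (x, y) pairs.
    Returns None if no edge of the box touches the image.
    """
    pts = []
    for i, j in BOX_EDGES:
        clipped = clip_segment(corners_2d[i], corners_2d[j], w, h)
        if clipped is not None:
            # Each endpoint is either an in-frame vertex or a border crossing.
            pts.extend(clipped)
    if not pts:
        return None
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return min(xs), min(ys), max(xs), max(ys)


# A box that sticks out past the left edge of the image.
example_corners = [(-100, 500), (300, 500), (-100, 900), (300, 900),
                   (-50, 550), (250, 550), (-50, 850), (250, 850)]
example_bbox = truncated_bbox(example_corners)
```

With something like this, the filter would presumably keep the depth conditions and replace the four image-bound checks with `truncated_bbox(...) is not None`. Note that this keeps a box whenever any projected edge crosses the image, which is slightly more permissive than requiring a corner to be in frame.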

Hi @seanremy,
Have you fixed this?

Hey @harishrithish7, I stopped using this repository for my work. However, you may find this interesting: argoverse/argoverse-api#159

That way you can do all of your work in the native argoverse format, which should be a more sustainable approach. PR should be merged soon, we're just working out some final documentation/unit test stuff, so stay tuned!

Hey @seanremy, thanks for the quick response. Yes, it does seem promising! Waiting for the merge.