Incorrect handling of truncated bounding boxes
Hey, thanks for your tool! It's pretty useful. One thing I noticed though -- it only keeps labels if the following conditions hold:
- the center is in front of the camera
- the center is within `max_d` meters of the camera
- the 2d bounding box corners are both within the image
I believe this is not the desired behavior for a KITTI-fied dataset, because even when the projected 2d-from-3d boxes are partially outside the frame (i.e. truncated), we still want to use the truncated versions as supervision. You can see in this video that KITTI keeps these boxes. I also believe this is standard practice in object recognition.
I think the right behavior is to keep the boxes if any corner is within the image frame. However, getting the 2D bbox in this case is more complicated. You probably want to find where the edges of the 3D bbox (projected to 2D) intersect the image border, and create a point set from the union of 1) the intersections of the 3D bbox edges with the image boundaries and 2) the remaining in-frame vertices of the 3D bbox, then take the min_x, min_y, max_x, max_y of that point set to get your 2D bbox.
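To make the idea above concrete, here is a rough sketch of how it could look. This is not the adapter's code: `BOX_EDGES`, `clip_segment`, and `truncated_bbox` are hypothetical names, and the corner ordering is an assumption. It also assumes the 8 corners have already been projected to pixel coordinates and all lie in front of the camera. Corners behind the camera would need to be clipped against a near plane in 3D first. Clipping each projected edge to the image rectangle (Liang-Barsky here) yields exactly the in-frame vertices plus the border intersections, so the min/max over those points gives the truncated box.

```python
import numpy as np

# Assumed corner ordering (hypothetical): 0-3 form one face, 4-7 the
# opposite face, and corner i is connected to corner i + 4.
BOX_EDGES = [(0, 1), (1, 2), (2, 3), (3, 0),
             (4, 5), (5, 6), (6, 7), (7, 4),
             (0, 4), (1, 5), (2, 6), (3, 7)]


def clip_segment(p, q, width, height):
    """Liang-Barsky clip of segment p->q to [0, width] x [0, height].

    Returns the clipped endpoints, or None if the segment misses the image.
    """
    d = q - p
    t0, t1 = 0.0, 1.0
    # Each (a, b) pair encodes one image border as the constraint a * t <= b.
    for a, b in ((-d[0], p[0]), (d[0], width - p[0]),
                 (-d[1], p[1]), (d[1], height - p[1])):
        if a == 0:
            if b < 0:
                return None  # parallel to this border and outside it
        elif a < 0:
            t0 = max(t0, b / a)
        else:
            t1 = min(t1, b / a)
    if t0 > t1:
        return None
    return p + t0 * d, p + t1 * d


def truncated_bbox(corners_2d, width, height):
    """Return (min_x, min_y, max_x, max_y) of the visible part of the box,
    or None if no projected edge touches the image."""
    corners_2d = np.asarray(corners_2d, dtype=float)
    points = []
    for i, j in BOX_EDGES:
        clipped = clip_segment(corners_2d[i], corners_2d[j], width, height)
        if clipped is not None:
            points.extend(clipped)
    if not points:
        return None
    # Clamp to the image to guard against floating-point round-off.
    points = np.clip(np.array(points), [0.0, 0.0], [width, height])
    min_x, min_y = points.min(axis=0)
    max_x, max_y = points.max(axis=0)
    return float(min_x), float(min_y), float(max_x), float(max_y)
```

For example, a box whose left side hangs off a 640x480 image gets `min_x` clamped to the left border, while a box entirely outside the image returns `None` and can be dropped.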
argoverse-kitti-adapter/adapter.py, line 196 in 019eafa
Hi @seanremy,
Have you fixed this?
Hey @harishrithish7, I stopped using this repository for my work. However, you may find this interesting: argoverse/argoverse-api#159
That way you can do all of your work in the native argoverse format, which should be a more sustainable approach. PR should be merged soon, we're just working out some final documentation/unit test stuff, so stay tuned!
Hey @seanremy, thanks for the quick response. Yes, it does seem promising! Waiting for the merge.