nutonomy/nuscenes-devkit

Possible issues with annotations

vatsal-shah opened this issue · 4 comments

It seems like there are some issues with the ground-truth annotations in some frames. I don't have an exhaustive list, but I've mainly been experimenting with the val set, and here are some sample tokens with annotation issues. The main issue is that some objects (I'm only experimenting with Cars) are not annotated in the ground truth.

0d9c4c2de24b49758901191f623d426b
0ed1a404c1fb4f7a87a30d3ee45f7f97
139bce92199440ea8929d1a1bd10dbda
224d34c137b64e4f8012c7280b4b9089
3abf81a7c3894000a4c508e6ced0caca
4b5202b4625f48799812af5d212e68a4
4e56a7a63b984597844eb55df9a2ba21
74109c3e72b24fb48e2262dc869ba868
8d265c91cc944ba790c09e74d2811d08
9827d52b3aa2484c8901f67f89742e15
f868542113014aeab862aa47e088b1ec
f91ec82037fb47ccbac160cb5de453bf
f9438d42bb944364b5a75d6c5d0bc758
fbbad6309f1543f78634e49c50dfb779

Is there something I'm missing?
Here are some sample images with the ground-truth annotations (Cars only) visualized:

False positive annotations:
[image 000309]

Missing Car annotations:
[images 000329, 000811, 001721, 005835]

Hi! Could you let me know if this is being looked at?
The following code can be run to visualize some of the frames with errors:

from nuscenes.nuscenes import NuScenes

# Load the trainval split (replace dataroot with your local path).
nusc = NuScenes(version='v1.0-trainval', dataroot='/path/to/nuscenes', verbose=True)
sensor = 'CAM_FRONT'

tokens = [
    '0d9c4c2de24b49758901191f623d426b', '0ed1a404c1fb4f7a87a30d3ee45f7f97',
    '139bce92199440ea8929d1a1bd10dbda', '224d34c137b64e4f8012c7280b4b9089',
    '3abf81a7c3894000a4c508e6ced0caca', '4b5202b4625f48799812af5d212e68a4',
    '4e56a7a63b984597844eb55df9a2ba21', '74109c3e72b24fb48e2262dc869ba868',
    '8d265c91cc944ba790c09e74d2811d08', '9827d52b3aa2484c8901f67f89742e15',
    'f9438d42bb944364b5a75d6c5d0bc758', 'fbbad6309f1543f78634e49c50dfb779',
]

for my_sample_token in tokens:
    print(my_sample_token)
    my_sample = nusc.get('sample', my_sample_token)
    cam_front_data = nusc.get('sample_data', my_sample['data'][sensor])
    # Render the front-camera frame with its annotations; write one image per token.
    nusc.render_sample_data(cam_front_data['token'], out_path='/path/to/{}.png'.format(my_sample_token))
  • False positive annotations: These are difficult to track, as the object may still be there; it may just be temporarily occluded. We therefore discard all boxes without a lidar or radar return inside them for the detection and tracking challenges. Also note that the lidar is mounted significantly higher than the cameras and may actually be able to see the vehicles on top of the hill.
  • Missing Car annotations: As noted above, we only annotate objects with at least 1 radar or lidar return (see the sketch after this list). Many of your examples are very borderline:
    • download: Likely doesn't have any returns, although it's a close call.
    • download (1): Has 0 points on the red car.
    • download (2): Annotated more cars than necessary. Only 1 has a lidar point.
    • download (6): The white car may have 2 lidar points.
    • download (7): Has no lidar points on the blue car.
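For reference, these per-box return counts are stored on every sample_annotation record (num_lidar_pts / num_radar_pts), so you can check the borderline cases yourself. A minimal sketch, assuming you substitute your own dataroot and one of the tokens above:

from nuscenes.nuscenes import NuScenes

# Count lidar/radar returns inside each car box of one sample, using the
# num_lidar_pts / num_radar_pts fields stored on every sample_annotation.
nusc = NuScenes(version='v1.0-trainval', dataroot='/path/to/nuscenes', verbose=True)
my_sample = nusc.get('sample', '0d9c4c2de24b49758901191f623d426b')

for ann_token in my_sample['anns']:
    ann = nusc.get('sample_annotation', ann_token)
    if ann['category_name'] == 'vehicle.car':
        n_pts = ann['num_lidar_pts'] + ann['num_radar_pts']
        # Boxes with zero returns are the ones dropped for the detection/tracking challenges.
        print('{}: {} lidar, {} radar -> kept: {}'.format(
            ann_token, ann['num_lidar_pts'], ann['num_radar_pts'], n_pts > 0))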

As you can see, it is often a very difficult call whether these objects should be annotated or not. A big part of the problem, of course, is the parallax: the lidar view is typically more informative than the camera view. Other datasets released since then have decided to only annotate objects with >= 5 lidar points. That of course makes the decision easier, but it also means the really hard objects aren't part of the dataset.
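If it helps to see the parallax effect directly, the devkit can project the lidar point cloud into the camera image. A minimal sketch using render_pointcloud_in_image, with placeholder paths and one of the tokens above:

from nuscenes.nuscenes import NuScenes

# Overlay the LIDAR_TOP point cloud on the CAM_FRONT image for one of the samples
# above, to see which points (if any) actually fall on the distant cars.
nusc = NuScenes(version='v1.0-trainval', dataroot='/path/to/nuscenes', verbose=True)
nusc.render_pointcloud_in_image('0ed1a404c1fb4f7a87a30d3ee45f7f97',
                                pointsensor_channel='LIDAR_TOP',
                                camera_channel='CAM_FRONT',
                                out_path='/path/to/overlay.png')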

I would be very interested in how you found these. Did you run an object detector and compare it to the ground truth?

Thank you for such a detailed response! It gave me a deep insight into the capture and annotation process.
Yes, I ran an object detector trained on the KITTI dataset and evaluated it on the nuScenes val set. The detector had an AP > 85% on the KITTI val set, but only achieved an AP of ~18% on the nuScenes val set, which led me to this analysis. Hope this helps!

Thanks for the analysis. Our dataset is definitely much harder. Especially in parking lots we try to label every vehicle with a lidar/radar point, which you don't see in e.g. KITTI.