NVlabs/Deep_Object_Pose

Results bounding box shifted for custom model


Hello,

when I use models provided with the repo (MobileNet) I receive:
[Images: 000000_belief (belief maps), 000000 (detection result)]

But when I use a custom model (Resnet-based), I receive:

[Images: resnet_000000_belief (belief maps), resnet_000000 (detection result)]

The belief maps look very similar, but the cuboid is shifted toward the bottom right. Any idea what's happening here?

A few things could be going on here. Can you check the raw 2D point locations, i.e. the array values? Are they different? Are you using different intrinsics or a different cuboid size?

But I am surprised by the shift as well; it should not be this big. Are you using train2 or train?

I was able to solve the issue. I found multiple things that could have impacted the results:

  1. You can see that the first (MobileNet) model also has a small shift. It can be removed by setting this variable in https://github.com/NVlabs/Deep_Object_Pose/blob/master/scripts/train2/inference/detector.py#L607 to 0:
    OFFSET_DUE_TO_UPSAMPLING = 0 #0.4395
    Could you explain its purpose and where its definition is coming from?

  2. For the ResNet model, the heatmap size is 52 instead of 50. I think this leads to incorrect upscaling. In your code the default output size is 400. The scale factor is 8 in https://github.com/NVlabs/Deep_Object_Pose/blob/master/scripts/train2/inference/detector.py#L550, which makes perfect sense to me, because 400/50=8. But when I adapt the scale factor to a heatmap of 52, I get incorrect results. How does the upscaling actually work?

  3. For ground truth data, I use the Isaac Sim replicator from NVIDIA. It comes with the DOPEWriter that produces matching ground truths for training. But I found that the provided projected centroid isn't actually the object's centroid, but the center point of the object's bottom plane.
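
To make the arithmetic in point 2 concrete, here is a minimal sketch of how a fixed factor of 8 behaves on a 52-pixel heatmap. It assumes the heatmap covers the full 400-pixel output, which may not match what detector.py actually does; `heatmap_to_image` is a hypothetical helper, not a function from the repo.

```python
import numpy as np

def heatmap_to_image(points_hm, heatmap_size, image_size):
    """Map (x, y) peak locations from heatmap pixels to image pixels.

    Assumption: the heatmap spans the whole image, so the scale factor
    is simply image_size / heatmap_size.
    """
    scale = image_size / heatmap_size
    return np.asarray(points_hm, dtype=float) * scale

# A peak at the centre of a 52x52 heatmap (hypothetical value).
peak = np.array([[26.0, 26.0]])

fixed = peak * 8                             # hard-coded factor, valid for 400/50
derived = heatmap_to_image(peak, 52, 400)    # factor derived from the sizes
shift = fixed - derived                      # positive x and y: right and down

print("fixed factor:  ", fixed)
print("derived factor:", derived)
print("shift in px:   ", shift)
```

Under that assumption, the error grows with distance from the top-left corner, so the whole cuboid drifts toward the bottom right.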

  1. Yeah, there was a bug in the first version for which I manually added this upscaling offset; I don't remember all the details.
  2. Hmm, how come it is 52? I think these 2 extra pixels are possibly what is shifting the results. The upscaling is px*8, and then PnP is run on that.

Did you check the raw 2D keypoint positions of one model against the other, without running PnP, so we can start isolating the problem?
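
One possible way to run that comparison is sketched below. The keypoint arrays are made-up placeholders, and `compare_keypoints` is a hypothetical helper, not part of the repo; in practice the two arrays would be the raw 2D keypoints each model produces for the same image.

```python
import numpy as np

def compare_keypoints(ref, other):
    """Compare two sets of 2D keypoints given as (N, 2) arrays.

    Returns the per-point offsets, their mean, and the least-squares
    scale s that minimises ||other - s * ref||.
    """
    ref = np.asarray(ref, dtype=float)
    other = np.asarray(other, dtype=float)
    offsets = other - ref
    scale = float((ref * other).sum() / (ref * ref).sum())
    return offsets, offsets.mean(axis=0), scale

# Placeholder keypoints in image pixels (not real detections).
ref = np.array([[100.0, 120.0], [200.0, 120.0],
                [200.0, 260.0], [100.0, 260.0]])
other = ref * 1.04  # simulate a pure scale error

offsets, mean_offset, scale = compare_keypoints(ref, other)
print("per-point offsets:\n", offsets)
print("mean offset:", mean_offset)
print("fitted scale:", scale)
```

Offsets that grow with the pixel coordinate and a fitted scale noticeably different from 1 would point to a scaling problem rather than to intrinsics or cuboid size.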

The shift was caused by a wrong scale factor, as I had changed the heatmap size.
Thank you for your time.