NVlabs/Deep_Object_Pose

Understanding the output of nvisii_data_gen


I have generated some images of the YCB cracker box with nvisii_data_gen and plotted the ground-truth cuboid corners on the respective images.
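For reference, roughly how I'm plotting the corners (a minimal sketch; file names are placeholders, and I'm assuming the generator's json layout with an "objects" list whose entries carry "projected_cuboid" pixel coordinates):

    import json
    import cv2

    img = cv2.imread("00000.png")
    with open("00000.json") as f:
        gt = json.load(f)

    for obj in gt["objects"]:
        for x, y in obj["projected_cuboid"]:
            # cv2.circle silently clips corners that fall outside the frame.
            cv2.circle(img, (int(round(x)), int(round(y))), 4, (0, 0, 255), -1)

    cv2.imwrite("00000_corners.png", img)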

  1. In this image, quite a few corners land on the shoe. I assume these belong to cracker boxes that are partially occluded by the shoe. Is that correct? Does DOPE ignore such corners during training? The "visibility" field in the json file is set to 1 for these objects. What does that mean in this context?
    [image]

  2. This one is trickier. A cracker box seems to be completely occluded by another one closer to the camera (bottom right). Again, does DOPE ignore the corners belonging to the occluded box? And again, "visibility" in the json file is set to 1. What does that mean in this situation?
    [image]

  3. What could the single red corner at the bottom left of this image be? There seems to be no cracker box behind the green object (we would have seen more corners in that case), and no cracker box sticks out below it either.
    [image]

Yeah, you need to add the flag that computes visibility using the segmentation masks: https://github.com/NVlabs/Deep_Object_Pose/blob/master/scripts/nvisii_data_gen/single_video_pybullet.py#L136-L144
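The idea behind that flag, roughly (a sketch, not the repo's exact code; the function and argument names here are made up):

    import numpy as np

    def visibility_fraction(full_mask: np.ndarray, visible_mask: np.ndarray) -> float:
        # full_mask: boolean mask of the object rendered alone, without occluders.
        # visible_mask: boolean mask of the same object in the full scene.
        total = int(full_mask.sum())
        if total == 0:
            return 0.0  # object lies entirely outside the frame
        return int(visible_mask.sum()) / total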

But from what I remember, you should get 0 or 1 for visibility. There might be a new bug.

We are currently upgrading the repo to use blenderproc as well.

Thanks @TontonTremblay
I am now calculating visibility using the --visibility-fraction argument.

As you pointed out, visibility is < 1 for occluded objects and == 1 for fully visible ones (even if part of the object is cut off by the frame border).

I'm just wondering how the visibility value is used during training. This line loads the cuboid keypoints only if visibility > 0:

if obj['visibility'] > 0:

Is it expected that the network learns where the occluded corners are from the non-occluded ones?

Yeah, in the training script, if the object counts as visible then all of its points are used as signal. You can fiddle with the threshold, e.g. 0.2, if you only want objects past that visibility to contribute a signal. If the value in the check above is 0, no heat map is generated to train against.
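Something like this, as a sketch (the constant and helper are made up; only the obj['visibility'] comparison comes from the loader):

    # Hypothetical tweak to the loader's check: raise the threshold above 0 so
    # heavily occluded objects stop contributing belief maps.
    VISIBILITY_THRESHOLD = 0.2  # 0 keeps every annotated object

    def keep_for_training(obj: dict) -> bool:
        # Generate heat maps for this object's corners only if enough of it is visible.
        return obj['visibility'] > VISIBILITY_THRESHOLD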

Ok thanks for the reply @TontonTremblay