j96w/6-PACK

One network for multiobject training

Hi, thank you for making your code public. I am familiar with your previous work (DenseFusion), which is also a great asset for 3D pose estimation.

I have a question about 6-PACK.

Can 6-PACK train one network for pose estimation across multiple categories, or does each category need its own network?

j96w commented

Hi, thanks for raising this good question. I think the "one-for-all" idea is possible, but I'm not sure how it would affect the results. To realize it, we would only need to branch the output layer of the 6-PACK keypoint network into different categories, as DenseFusion did. I didn't do that because I think there are some theoretical problems. 6-PACK tracks the 6D pose based on 3D keypoint detection, which is different from the pose regression in DenseFusion, so the "one-for-all" idea assumes that the keypoints of different categories share some common latent features. That might be true, but given the large geometric differences between these categories, I don't think it is a proper assumption to make.

Another difference between 6-PACK and DenseFusion is that 6-PACK has no segmentation model. This means that when two objects are very close to each other, a "one-for-all" model might easily switch to tracking the other object instead of the target we set at the beginning of tracking. For these reasons, I only did "one-for-one" in this work and not "one-for-all". But I admit it's worth a try.
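For anyone who wants to try it, here is a minimal sketch of what "branching the output layer into different categories" could look like. It is not code from the 6-PACK repository: the class name `MultiCategoryKeypointHead`, the feature size, the number of keypoints, and the number of categories are all assumptions made for illustration, and plain Python lists stand in for the real network tensors.

```python
import random


class MultiCategoryKeypointHead:
    """Hypothetical output layer: a shared feature vector goes in,
    and one linear branch per category predicts K 3D keypoints."""

    def __init__(self, feat_dim, num_keypoints, num_categories, seed=0):
        rng = random.Random(seed)
        self.num_keypoints = num_keypoints
        out_dim = num_keypoints * 3  # (x, y, z) per keypoint
        # weights[c] is an (out_dim x feat_dim) matrix owned by category c.
        self.weights = [
            [[rng.uniform(-0.1, 0.1) for _ in range(feat_dim)]
             for _ in range(out_dim)]
            for _ in range(num_categories)
        ]

    def forward(self, features, category_id):
        # Only the branch of the requested category is evaluated;
        # everything before this layer would be shared by all categories.
        branch = self.weights[category_id]
        flat = [sum(w * f for w, f in zip(row, features)) for row in branch]
        return [tuple(flat[3 * i:3 * i + 3]) for i in range(self.num_keypoints)]


# Assumed sizes, chosen only for the example.
head = MultiCategoryKeypointHead(feat_dim=16, num_keypoints=8, num_categories=6)
features = [0.5] * 16  # stand-in for the shared backbone output
kp_a = head.forward(features, category_id=0)
kp_b = head.forward(features, category_id=3)
```

The same shared features give different keypoints depending on the branch, which is exactly where the shared-latent-feature assumption comes in. Note that `category_id` still has to be supplied from outside, so this sketch does nothing about the object-switching problem caused by the lack of a segmentation model.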