nv-nguyen/gigapose

Questions about the paper

Closed this issue · 3 comments

Thanks for your great work!

I am reading the paper. Section 3.3 says: "To recover the remaining 2 DoFs, scale s and in-plane rotation α, we train deep networks to directly regress these values from a single 2D-2D correspondence. Since the feature extractor Fae is invariant to in-plane rotation and scaling, the corresponding features cannot be used to regress those values, hence we have to train another feature extractor we call Fist".
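For context, a minimal sketch (not the authors' code) of what "directly regress these values from a single 2D-2D correspondence" could look like: a small head that takes one matched feature pair from Fist and outputs scale and in-plane rotation. The feature dimension, the `ScaleRotationHead` name, and the log-scale / unit-circle parameterization are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleRotationHead(nn.Module):
    """Hypothetical head: one matched feature pair -> (scale s, angle alpha)."""
    def __init__(self, feat_dim: int = 256):  # feat_dim is an assumption
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 3),  # predicts [log s, cos alpha, sin alpha]
        )

    def forward(self, feat_query, feat_template):
        out = self.mlp(torch.cat([feat_query, feat_template], dim=-1))
        log_s = out[..., 0]
        cos_sin = F.normalize(out[..., 1:], dim=-1)  # project onto unit circle
        s = log_s.exp()                              # keeps the scale positive
        alpha = torch.atan2(cos_sin[..., 1], cos_sin[..., 0])
        return s, alpha

# Usage: a batch of 4 matched feature pairs -> (s, alpha) per pair
head = ScaleRotationHead()
s, alpha = head(torch.randn(4, 256), torch.randn(4, 256))
```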

Why? A simple image-registration method based on SIFT descriptors can recover these values, even though SIFT descriptors are themselves invariant to in-plane rotation and scaling.

I see. You use features to regress the scale s and in-plane rotation α.

Another question: why not use the correspondences and RANSAC to estimate the scale s, in-plane rotation α, and translation t, like a simple SIFT-based image-registration method, since you have already obtained the correspondences from Fae?
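For reference, a minimal sketch of the alternative being asked about: fitting a 4-DoF similarity transform (scale, in-plane rotation, translation) to 2D-2D correspondences with RANSAC, here via OpenCV's `estimateAffinePartial2D`. The point sets are dummy data for illustration only.

```python
import cv2
import numpy as np

src = np.random.rand(50, 2).astype(np.float32)   # template keypoints
angle, s, t = 0.3, 1.2, np.array([5.0, -2.0])    # ground-truth similarity
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
dst = (s * src @ R.T + t).astype(np.float32)     # matched query keypoints

# RANSAC fit of a 2x3 similarity matrix M = [sR | t]
M, inliers = cv2.estimateAffinePartial2D(
    src, dst, method=cv2.RANSAC, ransacReprojThreshold=3.0)

scale = np.hypot(M[0, 0], M[1, 0])               # recovers s
alpha = np.arctan2(M[1, 0], M[0, 0])             # recovers alpha
print(scale, np.degrees(alpha), M[:, 2])         # translation t in M[:, 2]
```

Note that this route needs several correspondences plus an inlier search, which is the contrast with the single-correspondence regression above.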

Thanks for your interest!

We show in Table 4 an ablation study with different ways to predict the scale and in-plane rotation from multiple correspondences, as you suggest (n=2 or n=4). Our method predicts the full 6D pose from a single correspondence (n=1), which is different from and outperforms those approaches.
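For intuition on the n=2 baseline in that ablation: with two correspondences, scale and in-plane rotation have a closed-form solution from the displacement vectors between the matched points. A quick sketch, assuming an exact 2D similarity transform relates the two point pairs:

```python
import numpy as np

def scale_rotation_from_two_matches(p1, p2, q1, q2):
    """p1, p2: points in the template; q1, q2: their matches in the query."""
    dp = np.asarray(p2, float) - np.asarray(p1, float)
    dq = np.asarray(q2, float) - np.asarray(q1, float)
    s = np.linalg.norm(dq) / np.linalg.norm(dp)               # relative scale
    alpha = np.arctan2(dq[1], dq[0]) - np.arctan2(dp[1], dp[0])  # relative angle
    return s, alpha

s, alpha = scale_rotation_from_two_matches((0, 0), (1, 0), (2, 3), (2, 5))
print(s, np.degrees(alpha))  # -> 2.0, 90.0
```

With n=1 this geometric route is unavailable, which is why regressing s and α directly from a single correspondence is the distinguishing step.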

I closed the issue, but feel free to re-open it if you have additional questions!