Query regarding use of *world_to_aligned_camera* transformation
Closed this issue · 2 comments
Hi @ymingxie
Thanks for sharing the great work!
Could you please help me understand the use of world_to_aligned_camera
transformation:
PlanarRecon/datasets/transforms.py
Line 80 in af1eb5f
What's the motivation for using the middle camera pose? And why is the xy-plane alignment defined in rotate_view_to_align_xyplane needed?
PlanarRecon/datasets/transforms.py
Line 65 in af1eb5f
As per the paper, I thought all the fusion is done in the world coordinate system. Why, then, is the sparse conv 3D backbone built in the aligned camera (middle camera pose) coordinate system?
PlanarRecon/models/planarrecon_network.py
Line 219 in af1eb5f
Looking forward to hearing from you soon.
Thanks & Best Regards
Shivam
Hi Shivam,
Thanks for your interest in our work!
- It is better to predict planes/geometry in a local coordinate frame for each fragment. I chose the middle camera's coordinate frame as that local frame.
- The rotate_view_to_align_xyplane function creates a gravity-aligned coordinate frame from the local (middle camera) frame. Most planes are parallel or perpendicular to the gravity direction. I leverage this prior and predict the planes/geometry in this gravity-aligned frame.
- The GRU fusion (following NeuralRecon) and sparse convolution are done in gravity-aligned coordinates. Before the GRU fusion, the global hidden state is transformed to the local gravity-aligned coordinates. After the GRU fusion, the updated global hidden state is transformed back to world coordinates.
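For anyone reading along, here is a minimal sketch of the idea in the three points above. It is not the code in datasets/transforms.py. The helper names `gravity_aligned_frame` and `world_to_aligned` are hypothetical, and the sketch assumes that world +z points up (against gravity) and that the camera looks along its own +z axis. It places the local origin at the middle camera of a fragment, keeps the z-axis along world up, and uses the camera's horizontal viewing direction as the x-axis.

```python
import numpy as np


def gravity_aligned_frame(cam_to_world):
    """Build a 4x4 aligned-to-world transform from a camera-to-world pose.

    Hypothetical helper. Assumptions: world +z is up, camera looks along +z.
    """
    R = cam_to_world[:3, :3]
    center = cam_to_world[:3, 3]
    up = np.array([0.0, 0.0, 1.0])  # assumption: gravity is along world -z

    # Project the viewing direction onto the horizontal plane.
    forward = R[:, 2]
    x_axis = forward - np.dot(forward, up) * up
    norm = np.linalg.norm(x_axis)
    if norm < 1e-8:
        # Camera looks straight up or down: use the camera x-axis instead,
        # which is horizontal in that case.
        x_axis = R[:, 0] - np.dot(R[:, 0], up) * up
        norm = np.linalg.norm(x_axis)
    x_axis = x_axis / norm
    y_axis = np.cross(up, x_axis)  # completes a right-handed frame

    aligned_to_world = np.eye(4)
    aligned_to_world[:3, 0] = x_axis
    aligned_to_world[:3, 1] = y_axis
    aligned_to_world[:3, 2] = up
    aligned_to_world[:3, 3] = center
    return aligned_to_world


def world_to_aligned(cam_poses):
    """Pick the middle camera of a fragment and return world -> aligned."""
    middle_pose = cam_poses[len(cam_poses) // 2]
    return np.linalg.inv(gravity_aligned_frame(middle_pose))


if __name__ == "__main__":
    # A camera tilted by 0.3 rad about the world x-axis, placed at (1, 2, 3).
    c, s = np.cos(0.3), np.sin(0.3)
    pose = np.eye(4)
    pose[:3, :3] = np.array([[1, 0, 0], [0, c, -s], [0, s, c]])
    pose[:3, 3] = [1.0, 2.0, 3.0]

    T = world_to_aligned([pose])

    # World-space voxel centers -> local gravity-aligned frame -> back.
    voxels_world = np.array([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]])
    homog = np.hstack([voxels_world, np.ones((2, 1))])
    voxels_local = (T @ homog.T).T
    voxels_back = (np.linalg.inv(T) @ voxels_local.T).T[:, :3]
    print(voxels_local[:, :3])
    print(np.allclose(voxels_back, voxels_world))
```

Because the z-axis of the local frame stays along gravity, a floor normal remains (0, 0, 1) and wall normals stay in the xy-plane no matter how the middle camera is tilted, which is the prior described above. The same transform and its inverse are what move coordinates between the world frame and the local frame before and after fusion.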
Thanks a lot @ymingxie, this helped. It's clear now!