Query regarding use of world_to_aligned_camera transformation

Thanks for sharing the great work!

Could you please help me understand the purpose of the world_to_aligned_camera transformation?

Line 80 in af1eb5f

    
           data['world_to_aligned_camera'] = torch.from_numpy(rotation_matrix4x4).float() @ middle_pose.inverse()

Looking forward to hearing from you soon.

Thanks & Best Regards
Shivam

Hi Shivam,

Thanks for your interest in our work!

It is better to predict planes/geometry in a local coordinate frame for each fragment. I chose the middle camera's coordinate frame as this local frame.
rotate_view_to_align_xyplane is used to create a gravity-aligned coordinate frame from the local frame (the middle camera frame). Most planes are parallel or perpendicular to the gravity direction, so I leverage this prior and predict the planes/geometry in the gravity-aligned frame.
The GRU fusion (following NeuralRecon) and the sparse convolutions are done in gravity-aligned coordinates. Before the GRU fusion, the global hidden state is transformed into the local gravity-aligned frame. After the GRU fusion, the updated global hidden state is transformed back to world coordinates.
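
For readers who want to see the composition concretely, here is a minimal NumPy sketch of how such a transform could be built. It is not the repository's implementation: the helpers rotation_between and world_to_aligned_camera are hypothetical, and the sketch assumes that middle_pose is a 4x4 camera-to-world matrix and that the world z axis points up (opposite to gravity). The repository itself uses torch tensors and rotate_view_to_align_xyplane, whose exact conventions may differ.

```python
import numpy as np


def rotation_between(a, b):
    """Return the 3x3 rotation that maps direction a onto direction b (Rodrigues)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    v = np.cross(a, b)          # rotation axis scaled by sin(angle)
    c = float(np.dot(a, b))     # cos(angle)
    s = np.linalg.norm(v)       # sin(angle)
    if s < 1e-8:
        if c > 0:
            return np.eye(3)    # already aligned
        # Antiparallel: rotate 180 degrees about any axis perpendicular to a.
        perp = np.cross(a, [1.0, 0.0, 0.0])
        if np.linalg.norm(perp) < 1e-8:
            perp = np.cross(a, [0.0, 1.0, 0.0])
        perp = perp / np.linalg.norm(perp)
        return 2.0 * np.outer(perp, perp) - np.eye(3)
    k = np.array([[0.0, -v[2], v[1]],
                  [v[2], 0.0, -v[0]],
                  [-v[1], v[0], 0.0]])
    return np.eye(3) + k + (k @ k) * ((1.0 - c) / s**2)


def world_to_aligned_camera(middle_pose):
    """Compose world -> middle camera -> gravity-aligned camera.

    Assumption: middle_pose is camera-to-world, and world +z is "up".
    """
    world_to_camera = np.linalg.inv(middle_pose)
    # Direction of the world "up" axis as seen from the middle camera.
    up_in_camera = world_to_camera[:3, :3] @ np.array([0.0, 0.0, 1.0])
    # Rotate the camera frame so that "up" becomes the local +z axis.
    align = np.eye(4)
    align[:3, :3] = rotation_between(up_in_camera, [0.0, 0.0, 1.0])
    return align @ world_to_camera


# Example: a camera pitched by 100 degrees about x and shifted to (1, 2, 3).
theta = np.deg2rad(100.0)
middle_pose = np.eye(4)
middle_pose[:3, :3] = [[1.0, 0.0, 0.0],
                       [0.0, np.cos(theta), -np.sin(theta)],
                       [0.0, np.sin(theta), np.cos(theta)]]
middle_pose[:3, 3] = [1.0, 2.0, 3.0]

to_aligned = world_to_aligned_camera(middle_pose)
to_world = np.linalg.inv(to_aligned)  # used to map results back to world coordinates
```

In this sketch the camera centre maps to the origin of the aligned frame and the world up direction maps to the local +z axis, so horizontal and vertical planes stay axis-aligned within each fragment. The inverse matrix corresponds to the step of moving the updated state back to world coordinates.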

Thanks a lot @ymingxie, this helped. It's clear now!