data['camera_to_aligned_camera'],planes_gt coordinate transformation and plane distance loss

hey@ymingxie, Thanks for your work! I have some questions to ask you for,Looking forward to your answer!

about data['camera_to_aligned_camera'], I find neuralrecon do not have this transformation. however planarrecon save it and use it for plane_gt, I can not understand this code:

PlanarRecon/datasets/transforms.py

Lines 152 to 165 in 00bd50c

    
           # world to camera 
        
           planes_gt = data['planes'] 
        
           planes_gt = (data['middle_pose'].transpose(0, 
        
                                                   1) @ planes_gt.transpose( 
        
               0, 1)).transpose(0, 1) 
        
           # reduce 2*pi to pi 
        
           inverse_id = torch.nonzero(planes_gt[:, 2] > 0, as_tuple=False).squeeze(1) 
        
           planes_gt[inverse_id] = -planes_gt[inverse_id] 
        
           # camera to aligned camera 
        
           planes_gt = (torch.inverse(data['camera_to_aligned_camera']).transpose(0, 
        
                                                                               1) @ planes_gt.transpose( 
        
               0, 1)).transpose(0, 1) 
        
           data['planes_trans'] = planes_gt

about plane distance loss, I find plane_gt transform by data['camera_to_aligned_camera'], but r_coords transform by data['world_to_aligned_camera'], Is there a problem with the calculated point and plane loss?

PlanarRecon/models/planarrecon_network.py

Lines 248 to 251 in 00bd50c

    
           loss = self.compute_loss(occ, class_logits, residuals, distance, off_center, occ_target, label_target, 
        
                                   anchors_gt, residual_gt, planes_gt, mean_xyz_gt, r_coords, 
        
                                   mask=grid_mask, 
        
                                   pos_weight=self.cfg.POS_WEIGHT)

Thanks!

Hi, this matrix is to convert the plane parameters from camera coordinate to a gravity-aligned coordinate :) The matrix is generated here: https://github.com/neu-vi/PlanarRecon/blob/main/datasets/transforms.py#L65. Because in ScanNet world coordinate the inverse z axis is gravity
direction, we can generate the gravity-aligned coordinate by using the transformation from camera to world coordinate. NeuralRecon also has this transformation (https://github.com/zju3dv/NeuralRecon/blob/master/datasets/transforms.py#L79)

The plane gt and r_coords are both in the aligned_camera (gravity-aligned) coordinate. It should be OK.

Feel free to reach out to me if you have more questions :)

Thanks! I got it!

	# world to camera
	planes_gt = data['planes']
	planes_gt = (data['middle_pose'].transpose(0,
	1) @ planes_gt.transpose(
	0, 1)).transpose(0, 1)
	# reduce 2*pi to pi
	inverse_id = torch.nonzero(planes_gt[:, 2] > 0, as_tuple=False).squeeze(1)
	planes_gt[inverse_id] = -planes_gt[inverse_id]
	# camera to aligned camera
	planes_gt = (torch.inverse(data['camera_to_aligned_camera']).transpose(0,
	1) @ planes_gt.transpose(
	0, 1)).transpose(0, 1)

	data['planes_trans'] = planes_gt

	loss = self.compute_loss(occ, class_logits, residuals, distance, off_center, occ_target, label_target,
	anchors_gt, residual_gt, planes_gt, mean_xyz_gt, r_coords,
	mask=grid_mask,
	pos_weight=self.cfg.POS_WEIGHT)