Misalignment in unprojected RGBD images
ayushjain1144 opened this issue · 5 comments
Hi @pengsongyou ,
Thank you for your excellent work, and especially for preprocessing the data into a common format!
I am using your pre-processed 2D Matterport data and unprojecting it to 3D using the provided intrinsics and extrinsics. This gives me the following point cloud, which does not look right to me:
wandb link: https://wandb.ai/ayushjain1144/m3d/runs/kdbjp0xg?workspace=user-ayushjain1144
I did some further analysis and found that all images unprojected from the same camera id are well aligned, but there seems to be some translation misalignment when combining images from different cameras. For example, in this wandb link, unprojected is the point cloud from a single camera uuid, and unprojected_all is the combined point cloud from all of these cameras.
I am fairly certain that my unprojection code is ok, because it works well on ScanNet. Do you have any idea what might be wrong? Do the visualizations look ok on your side?
Thank you!
Hi @ayushjain1144, thanks for your interest in our work! I am not sure how exactly you do your unprojection from 2D to 3D, but there might be some differences between the ScanNet and Matterport3D camera poses. I am not sure what the problem is, but can you try one thing:
pose[:3, 1] *= -1.0
pose[:3, 2] *= -1.0
Maybe this makes a difference. If not, you should check whether there are some differences between the Matterport3D and ScanNet poses.
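(For context, a flip like this usually corresponds to converting between an OpenGL-style camera frame, with y up and z pointing backward, and the OpenCV pinhole convention, with y down and z forward, that most unprojection code assumes. A minimal sketch of what I mean, assuming the poses are 4x4 camera-to-world numpy arrays:)

import numpy as np

def flip_camera_yz(pose):
    # Sketch only: negate the camera y and z axes of a camera-to-world pose.
    # Whether this is needed depends on which convention the poses were
    # exported in (OpenGL-style y-up/z-backward vs. OpenCV y-down/z-forward).
    pose = np.asarray(pose).copy()
    pose[:3, 1] *= -1.0  # flip camera y axis
    pose[:3, 2] *= -1.0  # flip camera z axis
    return pose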
Best
Songyou
Hi,
Thank you for your reply. I am not using the raw Matterport data but your processed data, where I think you have already applied that transformation to the poses. I also tried applying it again to your data, which just recovers the original poses, and the misalignment is still there.
My unprojection code is pretty standard, I think, and has worked for several other datasets too. Do the unprojected point clouds look ok on your side?
import torch

def unproject(intrinsics, poses, depths, mask_valid=True):
    """
    Inputs:
        intrinsics: B x V x 3 x 3 (torch.tensor)
        poses: B x V x 4 x 4 camera-to-world matrices (torch.tensor)
        depths: B x V x H x W (torch.tensor)
    Outputs:
        world_coords: B x V x H x W x 3 world coordinates;
        pixels with zero depth are set to -10, so (depths > 0)
        can be used to index into the RGB images and recover the
        N x 3 valid points / RGB values
    """
    B, V, H, W = depths.shape
    # per-view pinhole parameters, each B x V x 1
    fx = intrinsics[..., 0, 0][..., None]
    fy = intrinsics[..., 1, 1][..., None]
    px = intrinsics[..., 0, 2][..., None]
    py = intrinsics[..., 1, 2][..., None]
    # pixel grid, flattened to B x V x (H*W)
    y = torch.arange(0, H).to(depths.device)
    x = torch.arange(0, W).to(depths.device)
    y, x = torch.meshgrid(y, x, indexing="ij")
    x = x[None, None].repeat(B, V, 1, 1).flatten(2)
    y = y[None, None].repeat(B, V, 1, 1).flatten(2)
    z = depths.flatten(2)
    # back-project pixels into camera coordinates
    x = (x - px) * z / fx
    y = (y - py) * z / fy
    cam_coords = torch.stack([x, y, z, torch.ones_like(x)], -1)
    # camera -> world via the camera-to-world poses
    world_coords = (poses @ cam_coords.permute(0, 1, 3, 2)).permute(0, 1, 3, 2)
    world_coords = world_coords[..., :3] / world_coords[..., 3][..., None]
    world_coords = world_coords.reshape(B, V, H, W, 3)
    if mask_valid:
        world_coords[depths == 0] = -10
    return world_coords
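For reference, I call it roughly like this (the shapes and values below are just illustrative, not my actual data loading):

B, V, H, W = 1, 2, 256, 320
K = torch.tensor([[500.0, 0.0, W / 2],
                  [0.0, 500.0, H / 2],
                  [0.0, 0.0, 1.0]])
intrinsics = K[None, None].repeat(B, V, 1, 1)        # B x V x 3 x 3
poses = torch.eye(4)[None, None].repeat(B, V, 1, 1)  # B x V x 4 x 4 camera-to-world
depths = torch.rand(B, V, H, W)                      # B x V x H x W
world_coords = unproject(intrinsics, poses, depths)  # B x V x H x W x 3
points = world_coords[depths > 0]                    # N x 3 valid world points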
I did try it and, weirdly, it makes things worse:
To be precise, this is what I did:
poses = torch.from_numpy(np.array(poses)).float().cuda()
poses[..., :3, 1] *= -1.0
poses[..., :3, 2] *= -1.0
Earlier the images were aligned per camera uuid, but multiplying by -1 breaks that too. Here is also a wandb link to the visualizations: https://wandb.ai/ayushjain1144/m3d/runs/nvq21cwp?workspace=user-ayushjain1144
Thank you Songyou for your reply and continued help!
That's quite strange... Can you first try to run our feature fusion code to see whether it works correctly? We project from 3D to 2D to obtain the per-point features. If it works, maybe you can check whether you can adapt your 2D-to-3D unprojection accordingly?
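If it helps, the 3D-to-2D direction is essentially the inverse of your unproject, roughly like this (a simplified sketch, not our exact fusion code):

import torch

def project_to_image(points, intrinsic, pose, H, W):
    # points: N x 3 world coordinates, intrinsic: 3 x 3, pose: 4 x 4 camera-to-world.
    # Returns pixel coordinates and a mask of points that land inside the image.
    world2cam = torch.inverse(pose)
    pts_h = torch.cat([points, torch.ones_like(points[:, :1])], dim=1)  # N x 4
    cam = (world2cam @ pts_h.T).T[:, :3]                                # N x 3 camera coords
    uv = (intrinsic @ cam.T).T                                          # N x 3
    z = uv[:, 2]
    u, v = uv[:, 0] / z, uv[:, 1] / z
    inside = (z > 0) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    return u, v, inside

If the scene vertices project onto the right objects in each frame with this, the poses and intrinsics are consistent, and the issue is likely a convention mismatch in the unprojection.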
Best
Songyou