TRI-ML/DDAD

generate scene point cloud from depth view images

anh1nt opened this issue · 3 comments

It seem the cammeras are rotated in yaw, pitch, roll; if I would generate world pointcloud from depth image, I need to rotate the depth image first in terms because of cammera rotations, and then do inverse interm of intrinsic and extrinsic multiplication, is it correct? if it is, then what is the rotation should I use for rotating depth images?
many thanks!!

I think maybe @ivasiljevic can help with this as he worked on similar things on DDAD.

Hi @anh1nt,

not sure if you still need it but here's what I do. First, I load a scene with dgp (https://github.com/TRI-ML/dgp) as a SynchronizedSceneDataset object. Then, when you iterate over it, you can get the intrinsics as sample['intrinsics'] and the pose as sample['pose']. Since the pose is given as translation and quaternion, you can get the corresponding rotation matrix as follows (https://en.wikipedia.org/wiki/Quaternions_and_spatial_rotation):

def qvec2rotmat(qvec):
    return np.array([
        [1 - 2 * qvec[2]**2 - 2 * qvec[3]**2,
         2 * qvec[1] * qvec[2] - 2 * qvec[0] * qvec[3],
         2 * qvec[3] * qvec[1] + 2 * qvec[0] * qvec[2]],
        [2 * qvec[1] * qvec[2] + 2 * qvec[0] * qvec[3],
         1 - 2 * qvec[1]**2 - 2 * qvec[3]**2,
         2 * qvec[2] * qvec[3] - 2 * qvec[0] * qvec[1]],
        [2 * qvec[3] * qvec[1] - 2 * qvec[0] * qvec[2],
         2 * qvec[2] * qvec[3] + 2 * qvec[0] * qvec[1],
         1 - 2 * qvec[1]**2 - 2 * qvec[2]**2]])

At this point, just project each pixel with inverse intrinsics and align all the points into a common reference frame. Hope it helps!

Thank you so much, I also made a projection for cam views into ego view. just need to inverse the translation and rotatation of cam cloud points using P = K[R|t] matrix. It is working very good