YudongGuo/AD-NeRF

Order and Scale of the Transformation Matrix

Opened this issue · 2 comments

Hi Yudong,

Thanks for the amazing work! I noticed that in process_data.py you have the following manipulation of the rotation matrix and translation vector:

    trans = params_dict['trans'] / 10.0                  # scale the translation vector
    valid_num = euler_angle.shape[0]
    train_val_split = int(valid_num*10/11)               # 10/11 train, 1/11 validation
    train_ids = torch.arange(0, train_val_split)
    val_ids = torch.arange(train_val_split, valid_num)
    rot = euler2rot(euler_angle)                         # Euler angles -> rotation matrices
    rot_inv = rot.permute(0, 2, 1)                       # transpose each rotation matrix
    trans_inv = -torch.bmm(rot_inv, trans.unsqueeze(2))  # rotate and negate the translation
    pose = torch.eye(4, dtype=torch.float32)

I am wondering why you

  1. downscale the translation vector by 10 (trans = params_dict['trans'] / 10.0),
  2. apply a permutation to the rotation matrix (rot_inv = rot.permute(0, 2, 1)),
  3. rotate the translation vector (trans_inv = -torch.bmm(rot_inv, trans.unsqueeze(2))), and
  4. flip the sign of the translation vector in that same expression (trans_inv = -torch.bmm(rot_inv, trans.unsqueeze(2)))?

Looking forward to hearing from you!

Thanks,
Jeremy

Hi,

  1. In the face-tracking process, the camera space is measured in decimetres, and we convert it to metres by downscaling.
  2. The transformation (rotation and translation) produced by the tracking process is a 'canonical space to camera space' transformation. In NeRF, we need the 'camera space to canonical space' transformation, so we apply the inverse transformation (which is exactly what steps 2-4 compute); see the sketch below.
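For anyone else landing here, a minimal sketch of that inverse rigid transform, assuming the same tensor shapes as in the snippet above (the invert_rigid helper and the 4x4 assembly are illustrative, not the exact code in process_data.py):

    import torch

    def invert_rigid(rot, trans):
        """Invert a batch of rigid transforms: [R | t]^-1 = [R^T | -R^T t],
        turning canonical-to-camera poses into camera-to-canonical poses."""
        rot_inv = rot.permute(0, 2, 1)                       # R^T equals R^-1 for rotations
        trans_inv = -torch.bmm(rot_inv, trans.unsqueeze(2))  # -R^T t
        pose = torch.eye(4).unsqueeze(0).repeat(rot.shape[0], 1, 1)
        pose[:, :3, :3] = rot_inv
        pose[:, :3, 3] = trans_inv.squeeze(2)
        return pose

    # Quick check with an identity rotation: the inverse translation is just -t.
    R = torch.eye(3).unsqueeze(0)
    t = torch.tensor([[0.1, 0.2, 3.0]]) / 10.0   # decimetre -> metre, as in step 1
    print(invert_rigid(R, t))                    # last column: [-0.01, -0.02, -0.3, 1]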

Thanks a lot for your reply! For the first question, why do you want to convert the unit to metres? What unit does NeRF use?