How can I get a camera pose for the input image?

Question

AlbertHuyb opened this issue 8 months ago · 1 comment

Thanks for the nice work!

I want to get the camera pose for the input image. After reading some of the code, I noticed that the following code in provider.py renders the image exactly at the input camera pose:

        if not self.training:
            if is_face or can_pose:
                mvp = projection @ torch.inverse(poses.cpu()).to(self.device)
            else:
                mvp = torch.inverse(poses.cpu()).to(self.device)
                mvp[0, 2, 3] = 0.
                TO_WORLD = np.eye(
                    4,
                    dtype=np.float32,
                )
                TO_WORLD[2,2] = -1
                TO_WORLD[1,1] = -1
                TO_WORLD = mvp.new_tensor(TO_WORLD)
                mvp = TO_WORLD @ mvp

I wonder why this code works, since the resulting mvp matrix does not include the projection matrix. I also don't understand why mvp[0, 2, 3] should be set to zero.

Moreover, I'd like to know how I could compute a camera pose pose_gt such that the corresponding mvp matrix can be computed by

mvp = projection @ torch.inverse(pose_gt).to(self.device)

Looking forward to your response. Thanks!

Answer 1 · 2024-05-30T02:37:45.000Z

Hi, when the camera parameters are unknown, we use the same perspective camera as ICON, so this part is hard-coded. For evaluation on THuman2.0 and CAPE, we use the ground-truth camera matrices. If you have your own camera parameters, you can use them in the code as well.
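
For readers with their own calibration, here is a minimal sketch of how a pose_gt could be assembled so that mvp = projection @ inverse(pose_gt) holds. It is not taken from the repository. It assumes the extrinsics are given as a world-to-camera rotation R and translation t, that poses are camera-to-world matrices (consistent with the inverse in the snippet above), and that the projection follows the OpenGL convention. The helper names pose_from_extrinsics and perspective, and all numeric values, are hypothetical. NumPy is used for brevity; the same steps apply to torch tensors.

```python
import numpy as np

def pose_from_extrinsics(R, t):
    """Camera-to-world pose from world-to-camera extrinsics (x_cam = R @ x_world + t)."""
    w2c = np.eye(4, dtype=np.float32)
    w2c[:3, :3] = R
    w2c[:3, 3] = t
    # The pose is the inverse of the view (world-to-camera) matrix.
    return np.linalg.inv(w2c)

def perspective(fovy_deg, aspect, near, far):
    """OpenGL-style perspective projection (camera looks down -z). Assumed convention."""
    f = 1.0 / np.tan(np.radians(fovy_deg) / 2.0)
    P = np.zeros((4, 4), dtype=np.float32)
    P[0, 0] = f / aspect
    P[1, 1] = f
    P[2, 2] = (far + near) / (near - far)
    P[2, 3] = 2.0 * far * near / (near - far)
    P[3, 2] = -1.0
    return P

# Hypothetical calibration: no rotation, world origin 2 units in front of the camera.
R = np.eye(3, dtype=np.float32)
t = np.array([0.0, 0.0, -2.0], dtype=np.float32)

pose_gt = pose_from_extrinsics(R, t)
projection = perspective(fovy_deg=45.0, aspect=1.0, near=0.1, far=10.0)

# Same form as in the question: projection times the inverse of the pose.
mvp = projection @ np.linalg.inv(pose_gt)

# Sanity check: project the world origin and convert to normalized device coordinates.
origin = np.array([0.0, 0.0, 0.0, 1.0], dtype=np.float32)
clip = mvp @ origin
ndc = clip[:3] / clip[3]
```

With these values the camera sits at z = 2 in world space, and the world origin projects to the image center. Whether the repository's projection uses the same axis and depth conventions is an assumption that should be checked against provider.py before relying on this.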