akanazawa/hmr

3D position

A7ocin opened this issue · 8 comments

Hello @akanazawa and thank you for releasing the code for the paper.
I was trying to figure out how to get the 3D distance between the camera and the predicted 3D joints. Is there a way to do that?

So far, I've understood that HMR is object-centric, which is why the mesh is always positioned at (0,0,0) in the 3D world. I've also seen that the 3D skeleton is flipped, but a solution to that is mentioned in another issue.

The final step for me is to understand how to retrieve the 3D (x,y,z) of the mesh with respect to the camera. Is that possible? Maybe using the axis-angle 24 joints instead of the 19 ones?

Thank you so much

Hi @A7ocin,

You can approximate this using similar triangles (recall the Hartley and Zisserman book).
The x, y location is already given by the bbox coordinates (for an orthographic camera);
the question is z. The neat thing is that SMPL comes with an explicit height, so the recovered mesh has a height in real units. You also know the height of the person on the image plane.
If you then make an approximation of the focal length, you have all the components needed to approximate z.
This was used to estimate the depth in our SMPLify paper (see the camera initialization code) as well as in SfV. It's not perfect, but it is a reasonable approximation.
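
As a rough illustration of the similar-triangles idea, here is a minimal sketch in Python. It is not the SMPLify or HMR code: the function names, the assumption that the person stands upright along the mesh's y axis, the guessed focal length, and the principal point at the image center are all hypothetical choices made for this example.

```python
import numpy as np

def mesh_height(verts):
    """Height of the mesh in its own units.

    Assumes (for this sketch only) that the person is upright and that
    the vertical direction is the y axis of the (N, 3) vertex array.
    """
    verts = np.asarray(verts, dtype=float)
    return float(verts[:, 1].max() - verts[:, 1].min())

def estimate_depth(real_height, pixel_height, focal_length_px):
    """Similar triangles: pixel_height / focal_length = real_height / z."""
    if pixel_height <= 0:
        raise ValueError("pixel_height must be positive")
    return focal_length_px * real_height / pixel_height

def backproject(u, v, z, focal_length_px, principal_point):
    """Pinhole back-projection of pixel (u, v) at depth z to camera space."""
    cx, cy = principal_point
    x = (u - cx) * z / focal_length_px
    y = (v - cy) * z / focal_length_px
    return np.array([x, y, z])

# Hypothetical numbers: a 1.7 m tall mesh that spans 340 px in the image,
# with a guessed focal length of 500 px and a 640x480 image.
z = estimate_depth(real_height=1.7, pixel_height=340.0, focal_length_px=500.0)
position = backproject(u=420.0, v=240.0, z=z, focal_length_px=500.0,
                       principal_point=(320.0, 240.0))
```

With these made-up values the depth comes out at about 2.5 m. The result scales linearly with the focal length guess, so an error in the focal length translates directly into the same relative error in z.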

Ok, everything clear
Thank you

Hello @A7ocin,
I have also been studying the HMR paper recently, and I have the same question about how to obtain the distance between the camera and the predicted 3D joints. I think it is a meaningful problem. Have you managed to obtain it with the method the author recommended? Can you get depth information for each joint? I am looking forward to your reply. Thank you very much.

Hi @zhangkai95
Thank you for your interest. Yes, I managed to get the information you need about the real-world joint positions. I used the author's suggestion to solve my problem. If you want, we can talk about the solution via email.

Yes @A7ocin
I would really like to discuss this issue with you. Can you tell me your email address? Thank you very much.

Sure, you can find it on my GitHub profile.

> The x, y location is already given from the bbox coordinates

Can I ask what the output 'joint_3d' means? Is it the position of all joints?

I believe the 'joint_3d' field stored in the tfrecord is the 3D ground-truth joints given by Human3.6M. They are not SMPL's 3D joints, as H36M and SMPL have slightly different skeleton definitions. This field is not relevant to data without ground-truth 3D joints.

Best,

A