facebookresearch/frankmocap

Question about training the hand pose.

zhengsipeng opened this issue · 3 comments

Hi, thanks a lot for your great work.

Recently I've been working on a self-supervised model where I plan to use frankmocap to extract hand poses as pseudo labels instead of using ground truth.
According to Eq. (5) in your paper, the hand module loss is L = L_{theta} + L_{3D} + L_{2D} + L_{reg}. So if I want to use the same criterion as Eq. (5) in my work, I assume I need to use the predictions of your hand module as supervision accordingly, i.e.:
48-dim hand pose for L_{theta}'s label (pred_hand_pose);
10-dim shape for L_{reg}'s label (pred_hand_betas);
21x2-dim joints for L_{2D}'s label (pred_joints_img[:, :2]).

But which output should I use as supervision for L_{3D}? pred_joints_smpl, or something else? I notice that your hand module goes from 3D joints in smplx space -> 2D bbox -> 2D image coordinates; no 3D joints in image space are predicted.
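
In case it helps to make the mapping concrete, here is roughly how I'd collect those outputs as pseudo-labels (a minimal sketch assuming the hand module returns a per-hand output dict with the keys named above; the shapes in the comments are my assumptions and may differ by frankmocap version):

```python
import torch

def make_pseudo_labels(hand_out):
    """Detach one frankmocap hand-module output dict into fixed pseudo-labels."""
    return {
        "theta": torch.as_tensor(hand_out["pred_hand_pose"]).detach(),          # (48,) hand pose
        "beta":  torch.as_tensor(hand_out["pred_hand_betas"]).detach(),         # (10,) shape params
        "j2d":   torch.as_tensor(hand_out["pred_joints_img"])[:, :2].detach(),  # (21, 2) joints in image coords
        "j3d":   torch.as_tensor(hand_out["pred_joints_smpl"]).detach(),        # (21, 3) joints in smplx space
    }
```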

@zhengsipeng You can use pred_joints_smpl to calculate L_{3D}.

Thanks for your reply.
So I guess I can also use pred_hand_pose for L_{theta} and pred_joints_img for L_{2D}, is that right?

Yes.
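
For anyone landing here later, a minimal sketch of combining the four terms of Eq. (5) with the frankmocap outputs above as pseudo-labels. `pseudo` is the dict from the earlier sketch and `pred` is assumed to hold your own model's differentiable predictions in the same layout; the use of MSE for each term and the unit loss weights are assumptions, not the paper's exact settings:

```python
import torch
import torch.nn.functional as F

def hand_pseudo_label_loss(pred, pseudo, w_theta=1.0, w_3d=1.0, w_2d=1.0, w_reg=1.0):
    """L = L_theta + L_3D + L_2D + L_reg, supervised by frankmocap pseudo-labels."""
    loss_theta = F.mse_loss(pred["theta"], pseudo["theta"])  # 48-dim hand pose
    loss_3d    = F.mse_loss(pred["j3d"],   pseudo["j3d"])    # 21x3 joints in smplx space
    loss_2d    = F.mse_loss(pred["j2d"],   pseudo["j2d"])    # 21x2 joints in image coords
    # Per this thread, the 10-dim betas act as the target for L_reg; a plain
    # ||beta||^2 shape regularizer is an alternative if no beta target is used.
    loss_reg   = F.mse_loss(pred["beta"],  pseudo["beta"])
    return w_theta * loss_theta + w_3d * loss_3d + w_2d * loss_2d + w_reg * loss_reg
```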