Can you elaborate the intuition behind "parse_poses" and "get_root_relative_poses" functions in parse_poses.py?

Question

nessessence opened this issue 3 years ago · 2 comments

Can you elaborate on the intuition behind the "parse_poses" and "get_root_relative_poses" functions in parse_poses.py?
For example, why do we need to "read all pose coordinates at neck location" and then "refine keypoints coordinates at corresponding limbs locations"?

Also, why does "features" (inference_results[0]) sometimes have a different shape? For example, the shape can be (57, 32, 12), (57, 32, 9), or (57, 32, X), although most of the time it is (57, 32, 12).

Thanks in advance.

Answer 1 · 2021-08-03T18:59:56.000Z

Hi! You can find the details in the paper. In short, this adds robustness to pose prediction: the neck is usually visible, so it is OK to encode the coordinates of the other keypoints at the neck location. If other keypoints are also visible, their coordinates can be replaced with the coordinates read at their own locations. Regarding the second question, I believe the shapes should always be the same; possibly there is a bug somewhere.
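
To make the idea concrete, here is a minimal sketch of "read everything at the neck, then substitute visible keypoints". It is not the code from parse_poses.py. It assumes that the 57 channels are 19 keypoints times 3 coordinates, and that the channels for keypoint k are 3k to 3k+2. The function name `read_pose_at_neck` and its arguments are hypothetical.

```python
import numpy as np


def read_pose_at_neck(features, neck_xy, visible_keypoints_xy):
    """Illustrative only; the name and channel layout are assumptions.

    features: array of shape (3 * K, H, W), three coordinate maps per keypoint.
    neck_xy: (x, y) position of the neck in feature-map coordinates.
    visible_keypoints_xy: dict {keypoint_id: (x, y)} for keypoints that
        were detected at their own location.
    """
    num_keypoints = features.shape[0] // 3
    neck_x, neck_y = neck_xy

    # Step 1: read the coordinates of ALL keypoints at the neck location.
    pose = features[:, neck_y, neck_x].reshape(num_keypoints, 3).copy()

    # Step 2: for each visible keypoint, substitute the coordinates
    # read at that keypoint's own location.
    for kpt_id, (x, y) in visible_keypoints_xy.items():
        pose[kpt_id] = features[3 * kpt_id:3 * kpt_id + 3, y, x]

    return pose
```

With this layout, an occluded keypoint still gets coordinates (the ones stored at the neck), while a visible keypoint uses the values read at its own position. Note that the function in the repository refines the coordinates along limb locations, which this sketch simplifies to a direct substitution.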

Answer 2 · 2021-08-10T17:15:36.000Z

Hope it helped.