Can you elaborate on the intuition behind the "parse_poses" and "get_root_relative_poses" functions in parse_poses.py?
nessessence opened this issue · 2 comments
Can you elaborate on the intuition behind the "parse_poses" and "get_root_relative_poses" functions in parse_poses.py?
For example, why do we need to "read all pose coordinates at neck location" and "refine keypoints coordinates at corresponding limbs locations"?
Also, why does "features" (inference_results[0]) sometimes have a different shape? For example, feature.shape can be (57, 32, 12), (57, 32, 9), or (57, 32, X), although most of the time it is (57, 32, 12).
Thanks in advance.
Hi! You can find the details in the paper. In short, this adds robustness to pose prediction: the neck is usually visible, so it is fine to encode the coordinates of the other keypoints at the neck location. If other keypoints are also visible, their coordinates can be replaced by the coordinates read at their own locations. Regarding the second question, I believe the shape should always be the same, so possibly there is a bug somewhere.
Hope this helps.
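To make the idea above concrete, here is a minimal sketch of the "read at the neck, then refine" step. It is not the code from parse_poses.py: the function name `read_pose_at_neck`, the argument names, and the channel layout (57 channels treated as 19 keypoints × 3 coordinates, stored as consecutive x, y, z triples) are assumptions made for illustration. It also simplifies the refinement by reading each visible keypoint directly at its own location rather than along limbs.

```python
import numpy as np

def read_pose_at_neck(coord_maps, neck_yx, visible_yx=None):
    """Hypothetical helper, not the repository's implementation.

    coord_maps: array of shape (3 * K, H, W); channels 3k..3k+2 are assumed
                to hold the (x, y, z) coordinates of keypoint k.
    neck_yx:    (row, col) of the detected neck in the map.
    visible_yx: optional dict {keypoint index: (row, col)} for keypoints
                that were detected at their own locations.
    """
    num_kpts = coord_maps.shape[0] // 3
    ny, nx = neck_yx

    # Step 1: read every keypoint's coordinates at the neck location.
    pose = coord_maps[:, ny, nx].reshape(num_kpts, 3).copy()

    # Step 2: for keypoints that are visible, overwrite the neck-based
    # estimate with the values stored at the keypoint's own location.
    for k, (y, x) in (visible_yx or {}).items():
        pose[k] = coord_maps[3 * k:3 * k + 3, y, x]

    return pose

# Toy usage with the shape mentioned in the question, (57, 32, 12).
coord_maps = np.zeros((57, 32, 12), dtype=np.float32)
coord_maps[:, 10, 5] = np.arange(57)          # values stored at the neck
coord_maps[3:6, 20, 7] = [100, 200, 300]      # keypoint 1 at its own location
pose = read_pose_at_neck(coord_maps, (10, 5), {1: (20, 7)})
```

With this layout, a person whose neck is detected always gets a full pose, and only the keypoints that are actually visible are overwritten with their own, presumably more accurate, values.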