dluvizon/scene-aware-3d-multi-human

real-time

Opened this issue · 2 comments

I was very excited when I came across your article. Since your method is based on post-processing optimization, I wonder whether it could be modified to predict SMPL human models at their absolute positions in camera space in real time, perhaps with something like TensorRT? I look forward to your answer.

Hi @xiaoyudanaa, thanks for your interest. Definitely, the test-time optimization can be improved. At the moment, the method relies on stochastic gradient descent with the Adam optimizer. I guess it could be sped up with a quasi-Newton optimizer instead.
However, you have to take into account that the method takes predictions from off-the-shelf models as preprocessed input, and this could be a bottleneck.
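
To illustrate the optimizer point, here is a minimal sketch in plain Python. It is not the repository's code: the objective is a made-up, ill-conditioned quadratic, and the `adam` and `bfgs` functions are hand-written toy versions (BFGS being one common quasi-Newton method). It only shows why a quasi-Newton update can need far fewer iterations than Adam on a smooth deterministic problem; whether this carries over to the actual fitting losses is untested.

```python
import math

# Toy objective (an assumption for illustration, not the method's real loss):
# an ill-conditioned quadratic with its minimum at TARGET.
SCALES = (1.0, 100.0)
TARGET = (1.0, -2.0)

def loss(x):
    return 0.5 * sum(s * (xi - ti) ** 2 for s, xi, ti in zip(SCALES, x, TARGET))

def grad(x):
    return [s * (xi - ti) for s, xi, ti in zip(SCALES, x, TARGET)]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def adam(x, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8, tol=1e-6, max_steps=5000):
    """Standard Adam update; returns (solution, number of steps taken)."""
    x = list(x)
    m = [0.0] * len(x)
    v = [0.0] * len(x)
    for step in range(1, max_steps + 1):
        g = grad(x)
        if math.sqrt(dot(g, g)) < tol:
            return x, step - 1
        for i, gi in enumerate(g):
            m[i] = beta1 * m[i] + (1 - beta1) * gi
            v[i] = beta2 * v[i] + (1 - beta2) * gi * gi
            m_hat = m[i] / (1 - beta1 ** step)
            v_hat = v[i] / (1 - beta2 ** step)
            x[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x, max_steps

def bfgs(x, tol=1e-6, max_steps=200):
    """BFGS with Armijo backtracking; returns (solution, number of steps taken)."""
    n = len(x)
    x = list(x)
    # H approximates the inverse Hessian; start from the identity.
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = grad(x)
    for step in range(max_steps):
        if math.sqrt(dot(g, g)) < tol:
            return x, step
        p = [-dot(H[i], g) for i in range(n)]  # quasi-Newton search direction
        slope, f0, t = dot(g, p), loss(x), 1.0
        for _ in range(60):  # halve the step until sufficient decrease
            if loss([xi + t * pi for xi, pi in zip(x, p)]) <= f0 + 1e-4 * t * slope:
                break
            t *= 0.5
        x_new = [xi + t * pi for xi, pi in zip(x, p)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = dot(s, y)
        if sy > 0:  # curvature condition keeps H positive definite
            rho = 1.0 / sy
            Hy = [dot(H[i], y) for i in range(n)]
            c = rho * (1.0 + rho * dot(y, Hy))
            # Expanded form of (I - rho s y^T) H (I - rho y s^T) + rho s s^T
            H = [[H[i][j] - rho * (s[i] * Hy[j] + Hy[i] * s[j]) + c * s[i] * s[j]
                  for j in range(n)] for i in range(n)]
        x, g = x_new, g_new
    return x, max_steps

if __name__ == "__main__":
    x_adam, n_adam = adam([0.0, 0.0])
    x_bfgs, n_bfgs = bfgs([0.0, 0.0])
    print("Adam steps:", n_adam, "final loss:", loss(x_adam))
    print("BFGS steps:", n_bfgs, "final loss:", loss(x_bfgs))
```

In practice one would more likely swap in an existing L-BFGS implementation than write the update by hand. Note also that this only addresses the optimization loop, not the cost of the off-the-shelf models that produce the inputs.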

Thank you very much for your response. After reviewing your paper and code, I have noticed that you use DPT predictions as relative depth information for the SMPL model, but this is not actual depth. To my knowledge, current methods for reconstructing SMPL yield results that are not realistic (for instance, the actual height of a person, or their true position in the camera coordinate system). If I aim to convert the resulting SMPL into real values, how should I proceed? A known method involves post-optimization to decouple the camera, but I must also consider real-time constraints, making optimization methods unsuitable for my needs. Understanding that you have profound expertise in SMPL human body reconstruction, I would greatly appreciate your guidance on this matter. Thank you immensely.