NVlabs/Deep_Object_Pose

How to print the position and orientation of the detection target in the world frame?

Yukizyh opened this issue · 5 comments

Hi, I'm trying to calculate the accuracy on my own dataset and want to use coordinates in the world frame. The json file output by inference.py contains 'location' and 'quaternion_xyzw', but those are in the camera frame. I tried using the 'camera_view_matrix' saved when generating my own dataset to transform back to world coordinates, but the results differ from the values I set. How can we obtain world coordinates? Thank you!

Yeah, I used to write all of the world coordinates into the json file; sorry, I forgot to keep those. If you take the camera position and rotation, build a matrix from them, take the inverse, and multiply it with the object transform, you will get the pose in world space.

# Pseudocode: make_from_q_p builds a 4x4 transform from a quaternion and a translation.
mtx_camera = make_from_q_p(q_camera, trans_camera)              # camera pose
mtx_object_in_camera = make_from_q_p(q_obj, trans_obj)          # detection from inference.py
mtx_object_world = mtx_camera.inverse() * mtx_object_in_camera  # object pose in world frame

You could use pyrr's Matrix44 to do this. Build the rotation matrix from the quaternion, e.g. pyrr.Quaternion(q).matrix44, then append the translation.
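
Here is a minimal runnable sketch of that recipe using pyrr. The helper name make_from_q_p and the example pose values are illustrative placeholders, and depending on how your camera pose is stored you may need to swap the multiplication order or drop the inverse:

import numpy as np
from pyrr import Quaternion

def make_from_q_p(q_xyzw, p):
    # pyrr's Quaternion uses xyzw ordering, which matches DOPE's 'quaternion_xyzw'.
    mtx = Quaternion(q_xyzw).matrix44  # 4x4 rotation built from the quaternion
    mtx[3, 0:3] = p                    # pyrr stores the translation in the last row
    return mtx

# Illustrative values: camera pose in world, object pose from the inference json.
q_camera, trans_camera = [0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 1.0]
q_obj, trans_obj = [0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.5]

mtx_camera = make_from_q_p(q_camera, trans_camera)
mtx_object_in_camera = make_from_q_p(q_obj, trans_obj)
mtx_object_world = mtx_camera.inverse * mtx_object_in_camera  # .inverse is a property in pyrr

world_position = np.array(mtx_object_world[3, 0:3])     # translation of the object in world
world_quat_xyzw = Quaternion.from_matrix(mtx_object_world)  # rotation back as a quaternion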

Hope this helps.

Thanks a lot for helping! I just want to confirm: are the 'location' and 'quaternion_xyzw' we get from 'inference.py' based on the camera coordinate system we used to generate the data?

Inference will be in the OpenCV coordinate frame, so apply the same transform as above.
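
One thing that often trips this up: if your saved camera pose is in an OpenGL-style camera frame (x right, y up, z backward), as many renderers use, note that the OpenCV frame is x right, y down, z forward; the two differ by a 180-degree rotation about the camera x-axis. A hedged numpy sketch of the conversion (column-vector convention, illustrative names):

import numpy as np

# Maps coordinates from the OpenCV camera frame (x right, y down, z forward)
# to an OpenGL-style camera frame (x right, y up, z backward). The matrix is
# its own inverse, so it also converts in the other direction.
CV_TO_GL = np.diag([1.0, -1.0, -1.0, 1.0])

T_obj_in_camera_cv = np.eye(4)  # placeholder: 4x4 object pose in the OpenCV camera frame
T_obj_in_camera_gl = CV_TO_GL @ T_obj_in_camera_cv  # column-vector convention: p_cam = T @ p_obj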

Hi, Tonton. Thanks so much for your time.

But I still have some questions. The detection result I get from 'inference.py' only contains the camera intrinsics and the 'location', 'quaternion_xyzw', and 'projected_cuboid' of the objects. So, if I want to use the above method to calculate mtx_object_world, do we need mtx_camera in the OpenCV coordinate frame? If so, how can I get it?

I tried using the mtx_camera from the world coordinate system and doing the calculation with the detected object position and rotation, but the resulting obj_world_position and obj_world_quat are not the same as the values I set (they are off by a 90-degree rotation about the y-axis).

I am new to this so thanks for your patience!

Here is a session I just did that produces the solution you need: https://chat.openai.com/share/a9e30c79-f70a-456d-8e54-2ded88bacab0

Did you try to visualize the results inference.py produces with https://github.com/NVlabs/Deep_Object_Pose/blob/master/scripts/metrics/render_json.py? I think this code answers most of your questions.