MIT-SPARK/Hydra

[QUESTION] Scaling and rotation of objects

drawdehe opened this issue · 4 comments

Hello!

I have a question regarding the scaling and rotation of objects. Running the uHumans2 dataset, I am able to retrieve the positions of the objects by listening to the topic /hydra_dsg_visualizer/dsg_markers and extracting them from the Pose object. However, when I try to retrieve the scaling and rotation of the objects from the same topic (scaling from the Marker object and rotation from the Pose object), I cannot seem to get the correct information. Is this the correct approach, or am I doing something wrong? Any further help would be greatly appreciated.

Thank you!

Thanks for your interest in our work! I would not recommend parsing information about objects from the visualization messages (unless there's something specific to your application that would make that necessary). Hydra publishes the scene graph via a ROS topic and you can set up a DsgReceiver object to listen to this (see here for an example of how the visualizer accomplishes this). You can then grab information about the object nodes directly from the scene graph.
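As a rough sketch of the receiving side (assuming the DsgReceiver interface from hydra_utils that the visualizer node uses, i.e., a graph() accessor plus updated()/clearUpdated() flags; double-check the names against dsg_streaming_interface.h for your version):

#include <hydra_utils/dsg_streaming_interface.h>
#include <ros/ros.h>

int main(int argc, char* argv[]) {
  ros::init(argc, argv, "dsg_listener");
  ros::NodeHandle nh("~");
  hydra::DsgReceiver receiver(nh);  // subscribes to the serialized scene graph topic

  ros::Rate rate(10.0);
  while (ros::ok()) {
    ros::spinOnce();
    if (receiver.updated() && receiver.graph()) {
      const auto& graph = *receiver.graph();  // a spark_dsg::DynamicSceneGraph
      // ... read the object layer from the graph here (see the snippet below) ...
      receiver.clearUpdated();
    }
    rate.sleep();
  }
  return 0;
}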

That said, there are a couple important clarifications that may also help when working with the visualization messages:

  • We do NOT compute the pose of the objects directly (please see our paper for details). We instead compute a bounding box of the object mesh vertices that is either axis-aligned or an approximation of the minimum-volume bounding box (albeit with a yaw-only rotation). uHumans2 defaults to axis-aligned bounding boxes, so all the rotations would be identity.
  • The visualizer uses a single line-list marker to draw the wireframe bounding boxes of the objects. This may be where your confusion is coming from: you would have to back out which points comprise each bounding box and then compute the scale and 2D orientation (if applicable) from its corners (see the sketch after this list).
  • The positions in the visualization markers have an extra z offset applied relative to the true detected positions of the objects.
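If you do end up working from the line-list marker, a minimal sketch of recovering the center, scale, and yaw from the eight corners of a single box is below. It assumes you have already grouped the marker points into per-box corner sets (which the marker itself does not do for you), and the PCA-based yaw estimate is only determined up to a 90-degree ambiguity (the scale components swap accordingly):

#include <Eigen/Dense>
#include <cmath>
#include <limits>
#include <vector>

struct BoxEstimate {
  Eigen::Vector3d center;
  Eigen::Vector3d scale;  // extents along the (estimated) box axes
  double yaw;             // meaningful only for yaw-only oriented boxes
};

// corners: the 8 corner points of ONE bounding box, extracted from the marker
BoxEstimate estimateBox(const std::vector<Eigen::Vector3d>& corners) {
  BoxEstimate box;
  box.center = Eigen::Vector3d::Zero();
  for (const auto& p : corners) {
    box.center += p;
  }
  box.center /= static_cast<double>(corners.size());

  // estimate yaw from the principal axis of the corners projected to the x-y plane
  Eigen::Matrix2d cov = Eigen::Matrix2d::Zero();
  for (const auto& p : corners) {
    const Eigen::Vector2d d = p.head<2>() - box.center.head<2>();
    cov += d * d.transpose();
  }
  Eigen::SelfAdjointEigenSolver<Eigen::Matrix2d> solver(cov);
  const Eigen::Vector2d axis = solver.eigenvectors().col(1);  // largest eigenvalue
  box.yaw = std::atan2(axis.y(), axis.x());

  // rotate the corners into the box frame and take the extents
  const Eigen::Rotation2Dd R(-box.yaw);
  Eigen::Vector3d min_pt = Eigen::Vector3d::Constant(std::numeric_limits<double>::max());
  Eigen::Vector3d max_pt = Eigen::Vector3d::Constant(std::numeric_limits<double>::lowest());
  for (const auto& p : corners) {
    const Eigen::Vector2d q = R * (p.head<2>() - box.center.head<2>());
    const Eigen::Vector3d in_box(q.x(), q.y(), p.z());
    min_pt = min_pt.cwiseMin(in_box);
    max_pt = max_pt.cwiseMax(in_box);
  }
  box.scale = max_pt - min_pt;
  return box;
}

Again, reading these quantities directly from the scene graph attributes avoids all of this.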

Thank you for the answer! If I want to get the object information without going through the visualization messages, how would I approach that? For instance, how do I extract the object mesh vertices?

It would roughly look something like this:

#include <spark_dsg/dynamic_scene_graph.h>

spark_dsg::DynamicSceneGraph graph; // from the dsg receiver or elsewhere
const auto& objects = graph.getLayer(spark_dsg::DsgLayers::OBJECTS);
for (auto&& [id, node] : objects.nodes()) {
  // nodes() maps node IDs to node pointers, so attributes are accessed via ->
  const spark_dsg::ObjectNodeAttributes& attrs = node->attributes<spark_dsg::ObjectNodeAttributes>();
  // attrs.mesh_connections contains the indices of the mesh vertices connected to the node
}

In general, every node in the graph has a set of attributes (which for objects are defined here). The visualizer is probably a good reference for how these attributes get used (e.g., here is where the bounding boxes are drawn from the provided attributes).
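For the mesh vertices specifically, a sketch continuing the loop above might look like the following. It assumes the current spark_dsg mesh interface (a graph.mesh() accessor with a pos(index) method); older versions expose the vertices differently, so check the spark_dsg headers for your version:

const auto mesh = graph.mesh(); // assumed accessor, see note above
for (auto&& [id, node] : objects.nodes()) {
  const auto& attrs = node->attributes<spark_dsg::ObjectNodeAttributes>();
  for (const auto index : attrs.mesh_connections) {
    const Eigen::Vector3f vertex = mesh->pos(index); // world-frame vertex position
    // ... accumulate vertices, fit your own bounding box, etc. ...
  }
  // attrs.position and attrs.bounding_box also hold the precomputed
  // centroid and bounding box, if those are all you need
}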

Thank you for the help!