Developing a pointcloud aggregation node

The pointcloud aggregator subscribes to a point cloud or depth image and generates an aggregated point cloud that can be displayed in RViz or the web GUI.

Role

  • the primary role of this component is visualization and interaction, i.e. the aggregate is rendered to a teleoperator in order to give her better environmental awareness
  • we would like to aggregate and display colored points from multiple camera views because
    • the Kinect has a very limited field of view and we'd like the operator to see more than just that
    • while manipulating, the robot almost always occludes the part of the scene that is being manipulated, and the operator still needs to see it
  • there are two types of aggregation that need to be performed
    • spatial aggregation
    • temporal aggregation

Functionality

  • combine point clouds from multiple camera viewpoints and over time in an intelligent fashion
    • do not delete parts of the scene occluded by the robot
  • handle changes in the scene in an intelligent fashion
    • get rid of points that the robot can see are no longer there
    • investigate a "decay" time for general points so that the scene evolves over time (a sketch follows this list). This is an interesting UI problem.
    • investigate averaging color over time
  • ideally, do not display the robot (robot self-filter)
    • a robot self-filter is an independent component which could be placed in front of the PCAgg
    • if the PCAgg operates in 2D image space (which is very likely) we would need a robot self-filter that does the same; it is not clear what the status of that is
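
For the "decay" idea above, a minimal sketch (the voxel size, decay time, and all names here are assumptions, not part of the design): each point is stored per voxel together with the time it was last observed, so re-observations refresh it and anything unseen for longer than the decay time expires.

```python
# Hypothetical sketch of temporal aggregation with a decay time.
import time
import numpy as np

class DecayingCloud:
    def __init__(self, decay_time=30.0, voxel_size=0.01):
        self.decay_time = decay_time   # seconds until an unseen point expires (assumed value)
        self.voxel_size = voxel_size   # grid resolution used to merge nearby points
        self.points = {}               # voxel index -> (xyz, rgb, last_seen)

    def _key(self, xyz):
        # Quantize a point to its voxel index so that re-observations
        # overwrite the old entry instead of piling up duplicates.
        return tuple(np.floor(xyz / self.voxel_size).astype(int))

    def insert(self, xyz_array, rgb_array, stamp=None):
        stamp = time.time() if stamp is None else stamp
        for xyz, rgb in zip(xyz_array, rgb_array):
            self.points[self._key(xyz)] = (xyz, rgb, stamp)

    def snapshot(self, now=None):
        # Drop expired points, then return the surviving cloud.
        now = time.time() if now is None else now
        self.points = {k: v for k, v in self.points.items()
                       if now - v[2] < self.decay_time}
        if not self.points:
            return np.empty((0, 3)), np.empty((0, 3))
        xyz, rgb, _ = zip(*self.points.values())
        return np.asarray(xyz), np.asarray(rgb)
```

Averaging color over time, also mentioned above, could be folded into insert() by blending the incoming rgb with the stored value instead of overwriting it.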

User Interface

  • the output point cloud should be visible in either the RViz or the browser interface
    • likely to be an independent node producing a point cloud
    • transmission to the browser is still a tricky problem, as the output is very likely to be an unorganized point cloud.
  • the output point cloud should be interactive, i.e. the user should be able to click on it, and the system should recover information about where the user has clicked (a picking sketch follows this list)
    • mostly independent of the aggregation problem, but something to keep an eye on
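
For the picking requirement above, a rough sketch (all function and parameter names are assumptions): cast the view ray through the clicked pixel using a pinhole model, and among the aggregated points close to that ray return the one nearest the camera, i.e. the visible surface.

```python
# Hypothetical sketch: given a click at pixel (u, v), recover the 3D point
# the user selected by intersecting the view ray with the aggregated cloud.
import numpy as np

def pick_point(points, u, v, fx, fy, cx, cy, cam_to_world, max_ray_dist=0.05):
    """points: Nx3 array in the world frame; (fx, fy, cx, cy): pinhole
    intrinsics; cam_to_world: 4x4 camera pose. All names are assumptions."""
    # Ray direction through the pixel, rotated into the world frame.
    ray_cam = np.array([(u - cx) / fx, (v - cy) / fy, 1.0])
    ray = cam_to_world[:3, :3] @ ray_cam
    ray /= np.linalg.norm(ray)
    origin = cam_to_world[:3, 3]

    # Perpendicular distance of each point to the ray.
    rel = points - origin
    t = rel @ ray                      # distance along the ray
    perp = rel - np.outer(t, ray)      # component orthogonal to the ray
    dist = np.linalg.norm(perp, axis=1)

    # Only consider points in front of the camera and close to the ray.
    candidates = np.where((t > 0) & (dist < max_ray_dist))[0]
    if candidates.size == 0:
        return None
    # Of the candidates, the one nearest the camera is the visible surface.
    return points[candidates[np.argmin(t[candidates])]]
```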

Notes

  • unlike Octomap, we only care about surface information and not volumetric (unknown / known-empty) areas. This might enable different techniques; one possibility is sketched below.
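
One concrete possibility, sketched under assumed names and a simple pinhole model: because only surfaces are stored, the "points no longer there" check from the Functionality list can be a per-pixel depth test against the newest depth image, instead of Octomap-style ray casting through a volumetric map.

```python
# Hypothetical sketch of image-space culling: project each stored point into
# the latest depth image; if the camera now measures a surface clearly BEHIND
# the stored point, the sensor saw through its location, so the point is gone.
import numpy as np

def cull_seen_through(points, depth_image, fx, fy, cx, cy, world_to_cam,
                      margin=0.03):
    """Return only the points NOT contradicted by the new depth image.
    `margin` (meters) absorbs sensor noise; all names are assumptions."""
    # Transform stored points into the current camera frame.
    pts_cam = (world_to_cam[:3, :3] @ points.T).T + world_to_cam[:3, 3]
    z = pts_cam[:, 2]
    keep = np.ones(len(points), dtype=bool)

    # Project only the points in front of the camera.
    front = np.where(z > 0)[0]
    u = np.round(fx * pts_cam[front, 0] / z[front] + cx).astype(int)
    v = np.round(fy * pts_cam[front, 1] / z[front] + cy).astype(int)

    h, w = depth_image.shape
    in_img = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    idx = front[in_img]
    measured = depth_image[v[in_img], u[in_img]]

    # A valid measurement deeper than the stored point contradicts it.
    gone = np.isfinite(measured) & (measured > z[idx] + margin)
    keep[idx[gone]] = False
    return points[keep]
```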

Interface

  • input
    • depth image
    • rgb image
    • camera info
    • tf
  • output
    • point cloud
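
A hedged skeleton of how those inputs and the output could be wired up as an independent rospy node (topic names, the fixed frame, and the aggregation steps in the callback are assumptions, not decisions made in this issue):

```python
#!/usr/bin/env python
# Hypothetical wiring of the interface above; not the actual implementation.
import rospy
import tf
import message_filters
from sensor_msgs.msg import Image, CameraInfo, PointCloud2

class PointCloudAggregator(object):
    def __init__(self):
        self.tf_listener = tf.TransformListener()
        self.fixed_frame = rospy.get_param('~fixed_frame', 'odom_combined')
        self.pub = rospy.Publisher('aggregated_cloud', PointCloud2)

        # Depth, rgb, and camera info must describe the same instant,
        # so synchronize the three streams on their timestamps.
        depth = message_filters.Subscriber('camera/depth/image', Image)
        rgb = message_filters.Subscriber('camera/rgb/image_color', Image)
        info = message_filters.Subscriber('camera/depth/camera_info', CameraInfo)
        sync = message_filters.TimeSynchronizer([depth, rgb, info], 10)
        sync.registerCallback(self.callback)

    def callback(self, depth_msg, rgb_msg, info_msg):
        # 1. Back-project the depth image to a colored cloud.
        # 2. Transform it into self.fixed_frame via self.tf_listener.
        # 3. Merge into the aggregate (decay, culling, self-filtering above).
        # 4. Publish the result.
        cloud = PointCloud2()
        cloud.header.stamp = depth_msg.header.stamp
        cloud.header.frame_id = self.fixed_frame
        # ... fill cloud.fields / cloud.data from the aggregate ...
        self.pub.publish(cloud)

if __name__ == '__main__':
    rospy.init_node('pointcloud_aggregator')
    PointCloudAggregator()
    rospy.spin()
```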