MoreFusion

Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion

Kentaro Wada, Edgar Sucar, Stephen James, Daniel Lenton, Andrew J. Davison
Dyson Robotics Laboratory, Imperial College London
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020

Installation | Usage | Paper | Video | Website

MoreFusion is an object-level reconstruction system that builds a map of objects with known shapes, exploiting volumetric reconstruction of detected objects in a real-time, incremental scene reconstruction scenario. The key components are:

  • Occupancy-based volumetric reconstruction of detected objects for model alignment in the later stage;
  • Volumetric pose prediction that exploits volumetric reconstruction and CNN feature extraction from the image observation;
  • Joint pose refinement of objects based on geometric consistency among objects and impenetrable space.
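
To make the first and third components concrete, here is a minimal NumPy sketch, not taken from the MoreFusion code base, of how a point set can be turned into a boolean occupancy grid and how overlap between two objects' grids can be counted as a crude collision signal. The function names `voxelize` and `count_collisions`, the grid size, and the voxel pitch are all illustrative assumptions.

```python
import numpy as np


def voxelize(points, origin, pitch, dims):
    """Return a boolean occupancy grid marking voxels that contain a point.

    points: (N, 3) array; origin: (3,) minimum corner of the grid;
    pitch: voxel edge length; dims: number of voxels along each axis.
    (Hypothetical helper, not part of the morefusion package.)
    """
    grid = np.zeros(dims, dtype=bool)
    # Map each point to integer voxel indices.
    idx = np.floor((np.asarray(points) - origin) / pitch).astype(int)
    # Discard points that fall outside the grid bounds.
    inside = np.all((idx >= 0) & (idx < np.asarray(dims)), axis=1)
    idx = idx[inside]
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid


def count_collisions(grid_a, grid_b):
    """Count voxels occupied in both grids (same origin, pitch and dims)."""
    return int(np.count_nonzero(grid_a & grid_b))


origin = np.zeros(3)
pitch = 0.01  # 1 cm voxels (illustrative value)
dims = (32, 32, 32)

rng = np.random.default_rng(0)
object_a = rng.uniform(0.05, 0.15, size=(2000, 3))  # synthetic points
object_b = object_a + np.array([0.05, 0.0, 0.0])  # shifted copy overlapping A
object_c = object_a + np.array([0.16, 0.0, 0.0])  # shifted copy clear of A

grid_a = voxelize(object_a, origin, pitch, dims)
grid_b = voxelize(object_b, origin, pitch, dims)
grid_c = voxelize(object_c, origin, pitch, dims)

overlap_ab = count_collisions(grid_a, grid_b)  # positive: objects intersect
overlap_ac = count_collisions(grid_a, grid_c)  # zero: objects are apart
print(overlap_ab, overlap_ac)
```

A refinement step in this spirit would adjust object poses so that such overlap shrinks. The actual system works on occupancy reconstructed from depth observations and is described in the paper.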

Installation

There are several installation options:

Python project only

make install

ROS project for camera demonstration

mkdir -p ~/ros_morefusion/src
cd ~/ros_morefusion/src

git clone https://github.com/wkentaro/morefusion.git
cd morefusion
make install

cd ~/ros_morefusion
ln -s src/ros/*.sh .

./rosdep_install.sh
./catkin_build.robot_agent.sh

source .autoenv.zsh

ROS project for robotic demonstration

  • robot-agent: A computer for visual processing.
  • robot-node: A computer with a real-time OS for the Panda robot.

@robot-agent

Follow the same instructions as in "ROS project for camera demonstration" above.

@robot-node

mkdir -p ~/ros_morefusion/src
cd ~/ros_morefusion/src

git clone https://github.com/wkentaro/morefusion.git

cd ~/ros_morefusion
ln -s src/ros/*.sh .

./catkin_build.robot_node.sh
source devel/setup.bash

rosrun franka_control_custom create_udev_rules.sh

Usage

Training & Inference

Pre-trained models are provided with the demos below, so training is optional if you only want to run the demos.

Instance Segmentation

cd examples/ycb_video/instance_segm
./download_dataset.py
mpirun -n 4 python train_multi.py  # 4-gpu training
./image_demo.py --model logs/XXX/XXX.npz

6D pose prediction

# baseline model (point-cloud-based)
cd examples/ycb_video/singleview_pcd
./download_dataset.py
./train.py --gpu 0 --centerize-pcd --pretrained-resnet18  # 1-gpu
mpirun -n 4 ./train.py --multi-node --centerize-pcd --pretrained-resnet18  # 4-gpu

# volumetric prediction model (3D-CNN-based)
cd examples/ycb_video/singleview_3d
./download_dataset.py
./train.py --gpu 0 --centerize-pcd --pretrained-resnet18 --with-occupancy  # 1-gpu
mpirun -n 4 ./train.py --multi-node --pretrained-resnet18 --with-occupancy  # 4-gpu
mpirun -n 4 ./train.py --multi-node --pretrained-resnet18  # w/o occupancy

# inference
./download_pretrained_model.py  # for downloading pretrained model
./demo.py logs/XXX/XXX.npz
./evaluate.py logs/XXX
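
For readers new to the term, a 6D pose combines a 3D rotation and a 3D translation. The sketch below is an illustration only, not the repository's API: it assembles a 4x4 homogeneous transform from a quaternion and a translation and applies it to model points. The `(w, x, y, z)` quaternion ordering and the helper names are assumptions made for this example.

```python
import numpy as np


def quaternion_to_matrix(q):
    """Convert a quaternion in (w, x, y, z) order to a 3x3 rotation matrix."""
    w, x, y, z = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


def pose_to_transform(quaternion, translation):
    """Build a 4x4 homogeneous transform from rotation and translation."""
    T = np.eye(4)
    T[:3, :3] = quaternion_to_matrix(quaternion)
    T[:3, 3] = translation
    return T


def transform_points(T, points):
    """Apply a 4x4 rigid transform to an (N, 3) array of points."""
    points = np.asarray(points, dtype=float)
    return points @ T[:3, :3].T + T[:3, 3]


# 90 degree rotation about the z-axis followed by a 0.5 shift along x.
q = [np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)]
T = pose_to_transform(q, [0.5, 0.0, 0.0])
moved = transform_points(T, [[1.0, 0.0, 0.0]])
print(moved)  # approximately [[0.5, 1.0, 0.0]]
```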

Joint pose refinement

cd examples/ycb_video/pose_refinement
./check_icp_vs_icc.py  # press [s] to start
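
The script name suggests a comparison involving ICP, presumably iterative closest point. This README does not describe either method, so the following is only a generic textbook ICP sketch in NumPy (brute-force nearest neighbours plus a Kabsch alignment step), not the code the script runs. All names and the synthetic data are assumptions for illustration.

```python
import numpy as np


def best_fit_transform(source, target):
    """Least-squares rigid transform (R, t) mapping source onto target."""
    src_mean, tgt_mean = source.mean(axis=0), target.mean(axis=0)
    H = (source - src_mean).T @ (target - tgt_mean)
    U, _, Vt = np.linalg.svd(H)
    # Guard against reflections so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_mean - R @ src_mean
    return R, t


def icp(source, target, iterations=20):
    """Align source to target using brute-force nearest-neighbour matching."""
    current = source.copy()
    for _ in range(iterations):
        # Pairwise squared distances, shape (len(current), len(target)).
        d2 = ((current[:, None, :] - target[None, :, :]) ** 2).sum(axis=2)
        matched = target[d2.argmin(axis=1)]
        R, t = best_fit_transform(current, matched)
        current = current @ R.T + t
    return current


rng = np.random.default_rng(0)
target = rng.uniform(-0.5, 0.5, size=(200, 3))

# Perturb the target by a small rotation about z and a small translation.
angle = np.deg2rad(2.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
source = target @ R_true.T + np.array([0.01, 0.0, 0.0])

aligned = icp(source, target)
error_before = np.linalg.norm(source - target, axis=1).mean()
error_after = np.linalg.norm(aligned - target, axis=1).mean()
print(error_before, error_after)  # the second value should be smaller
```

ICP of this kind aligns each object independently and knows nothing about neighbouring objects, which is the limitation that joint refinement across objects is meant to address.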

Camera demonstration

Static Scene

# using orb-slam2 for camera tracking
roslaunch morefusion_panda_ycb_video rs_rgbd.launch
roslaunch morefusion_panda_ycb_video rviz_static.desk.launch
roslaunch morefusion_panda_ycb_video setup_static.desk.launch

Figure 1. Static Scene Reconstruction with the Human Hand-mounted Camera.

# using robotic kinematics for camera tracking
roslaunch morefusion_panda_ycb_video rs_rgbd.robot.launch
roslaunch morefusion_panda_ycb_video rviz_static.robot.launch
roslaunch morefusion_panda_ycb_video setup_static.robot.launch

Figure 2. Static Scene Reconstruction with the Robotic Hand-mounted Camera.

Dynamic Scene

# desk setup (human hand-mounted camera)
roslaunch morefusion_panda_ycb_video rs_rgbd.launch
roslaunch morefusion_panda_ycb_video rviz_dynamic.desk.launch
roslaunch morefusion_panda_ycb_video setup_dynamic.desk.launch

# robot setup (robotic hand-mounted camera)
roslaunch morefusion_panda_ycb_video rs_rgbd.robot.launch
roslaunch morefusion_panda_ycb_video rviz_dynamic.robot.launch
roslaunch morefusion_panda_ycb_video setup_dynamic.robot.launch

Figure 3. Dynamic Scene Reconstruction with the Human Hand-mounted Camera.

Robotic Demonstration

Robotic Pick-and-Place

robot-agent $ sudo ntpdate 0.uk.pool.ntp.org  # for time synchronization
robot-node  $ sudo ntpdate 0.uk.pool.ntp.org  # for time synchronization

robot-node  $ roscore

robot-agent $ roslaunch morefusion_panda panda.launch

robot-node  $ roslaunch morefusion_panda_ycb_video rs_rgbd.robot.launch
robot-node  $ roslaunch morefusion_panda_ycb_video rviz_static.launch
robot-node  $ roslaunch morefusion_panda_ycb_video setup_static.robot.launch TARGET:=2
robot-node  $ rosrun morefusion_panda_ycb_video robot_demo_node.py
>>> ri.run()

Figure 4. Targeted Object Pick-and-Place. (a) Scanning the Scene; (b) Removing Distractor Objects; (c) Picking Target Object.

Citation

If you find MoreFusion useful, please consider citing the paper as:

@inproceedings{Wada:etal:CVPR2020,
  title={{MoreFusion}: Multi-object Reasoning for {6D} Pose Estimation from Volumetric Fusion},
  author={Kentaro Wada and Edgar Sucar and Stephen James and Daniel Lenton and Andrew J. Davison},
  booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition ({CVPR})},
  year={2020},
}