Project Page | PDF | arXiv
Tianxing Chen*, Yao Mu*†, Zhixuan Liang*, Zanxin Chen, Shijia Peng, Qiangyu Chen, Mingkun Xu, Ruizhen Hu, Hongyuan Zhang, Xuelong Li, Ping Luo†.
We present G3Flow, a novel approach that leverages foundation models to generate and maintain 3D semantic flow for enhanced robotic manipulation.
See INSTALLATION.md for installation instructions. Installation takes about 30 minutes.
This step collects data on RoboTwin for each task. Each task yields 100 sets of data, including point cloud and RGBD data.
${task_name}: bottle_adjust_T, bottle_adjust_G, diverse_bottles_pick_G, shoe_place_T, shoe_place_G, shoes_place_T, shoes_place_G, tool_adjust_T, tool_adjust_G.
cd RoboTwin_Benchmark
bash run_task.sh ${task_name} ${gpu_id}
cd ..
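To collect data for every task in one pass, the command above can be wrapped in a loop. The following is a minimal sketch, not part of the repository: the `collect_all` function and the `DRY_RUN` variable are hypothetical helpers introduced here for illustration. With `DRY_RUN=1` (the default in this sketch) it only prints the commands it would run.

```shell
# Task names as listed above (space-separated; none contain spaces).
TASKS="bottle_adjust_T bottle_adjust_G diverse_bottles_pick_G shoe_place_T shoe_place_G shoes_place_T shoes_place_G tool_adjust_T tool_adjust_G"

# Hypothetical helper (not in the repository): collect data for every task on one GPU.
# DRY_RUN=1 prints each command; DRY_RUN=0 runs it inside RoboTwin_Benchmark.
collect_all() {
  gpu_id="$1"
  for task in $TASKS; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "(cd RoboTwin_Benchmark && bash run_task.sh ${task} ${gpu_id})"
    else
      # The subshell keeps the working directory unchanged for the caller.
      (cd RoboTwin_Benchmark && bash run_task.sh "${task}" "${gpu_id}") || return 1
    fi
  done
}

# Preview the commands for GPU 0.
collect_all 0
```

Set `DRY_RUN=0` before calling `collect_all` to actually run the collection. The loop stops at the first task that fails.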
This step processes the raw data to obtain G3Flow data for each time step, as well as a PCA model. The n_components parameter is the target dimensionality of the PCA reduction.
bash process_data.sh ${task_name} ${expert_data_num} ${n_components} ${gpu_id}
The processed data will be stored in the G3FlowDP/data directory, and the obtained PCA model will be stored in the G3FlowDP/PCA_model directory.
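Because `process_data.sh` takes four positional arguments, a small wrapper that checks them before launching can catch mistakes early. This is a sketch under stated assumptions: `process_task` and `DRY_RUN` are hypothetical names that do not exist in the repository, and the value `5` for `n_components` is an arbitrary illustration, not a recommended setting.

```shell
# Hypothetical wrapper (not in the repository): validate arguments, then
# print (DRY_RUN=1, the default here) or run the processing command.
process_task() {
  if [ "$#" -ne 4 ]; then
    echo "usage: process_task <task_name> <expert_data_num> <n_components> <gpu_id>" >&2
    return 2
  fi
  task_name="$1"; expert_data_num="$2"; n_components="$3"; gpu_id="$4"

  # n_components is the PCA target dimensionality, so it must be a positive integer.
  case "$n_components" in
    ''|*[!0-9]*|0)
      echo "n_components must be a positive integer" >&2
      return 2
      ;;
  esac

  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "bash process_data.sh ${task_name} ${expert_data_num} ${n_components} ${gpu_id}"
  else
    bash process_data.sh "${task_name}" "${expert_data_num}" "${n_components}" "${gpu_id}"
  fi
}

# Illustrative call: 100 matches the number of collected sets per task;
# 5 is an assumed n_components value used only as an example.
process_task bottle_adjust_T 100 5 0
```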
bash train.sh ${task_name} ${expert_data_num} ${n_components} ${seed} ${gpu_id}
bash eval.sh ${task_name} ${expert_data_num} ${n_components} ${seed} ${gpu_id}
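The processing, training, and evaluation commands share most of their arguments, so they can be chained for a single task. The sketch below assumes it is run from the directory that contains the three scripts. `run_pipeline` and `DRY_RUN` are hypothetical helpers, not part of the repository, and the argument values in the example call are illustrative only.

```shell
# Hypothetical helper (not in the repository): process, train, and evaluate one task.
# Arguments: task_name expert_data_num n_components seed gpu_id
run_pipeline() {
  if [ "$#" -ne 5 ]; then
    echo "usage: run_pipeline <task_name> <expert_data_num> <n_components> <seed> <gpu_id>" >&2
    return 2
  fi
  for script in process_data.sh train.sh eval.sh; do
    case "$script" in
      # process_data.sh takes no seed argument.
      process_data.sh) args="$1 $2 $3 $5" ;;
      *)               args="$1 $2 $3 $4 $5" ;;
    esac
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "bash $script $args"
    else
      # $args is intentionally unquoted: the values contain no spaces.
      bash "$script" $args || return 1
    fi
  done
}

# Preview the three commands (DRY_RUN defaults to 1 in this sketch).
run_pipeline shoe_place_T 100 5 0 0
```

Using the same `expert_data_num`, `n_components`, and `seed` across the stages keeps the training and evaluation runs aligned with the processed data and PCA model.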
If you find our work useful, please consider citing:
@article{chen2024g3flow,
title={G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation},
author={Chen, Tianxing and Mu, Yao and Liang, Zhixuan and Chen, Zanxin and Peng, Shijia and Chen, Qiangyu and Xu, Mingkun and Hu, Ruizhen and Zhang, Hongyuan and Li, Xuelong and others},
journal={arXiv preprint arXiv:2411.18369},
year={2024}
}
Our code is built upon Diffusion Policy, FoundationPose, Grounded-SAM, and DP3. We thank the authors for their open-source code and their contributions to the community.
Contact Tianxing Chen if you have any questions or suggestions.
This repository is released under the MIT license. See LICENSE for additional details.