# GraspSplats: Efficient Manipulation with 3D Feature Splatting
## Create the environment
Set up a conda/mamba/micromamba environment for the project:
```bash
micromamba create -n grasp_splats python=3.10 -c conda-forge
micromamba activate grasp_splats
```
## Install part-level feature splatting
Clone the repository and install the required components for part-level feature splatting:
```bash
git clone --recursive https://github.com/vuer-ai/feature-splatting-inria.git
cd feature-splatting-inria
git checkout roger/graspsplats_part

# Install PyTorch and Torchvision with CUDA 11.8 support
pip install torch==2.1.2+cu118 torchvision==0.16.2+cu118 --extra-index-url https://download.pytorch.org/whl/cu118

# Install CUDA Toolkit 11.8
micromamba install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
```
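Optionally, you can confirm that the CUDA build of PyTorch is active before continuing (a quick sanity check, not part of the original setup):

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```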
Next, set up the submodules and required dependencies:
```bash
# Install diff-gaussian-rasterization submodule
cd submodules/diff-gaussian-rasterization
pip install -e .

# Install simple-knn submodule
cd ../..
cd submodules/simple-knn
pip install -e .

# Install remaining requirements
cd ../..
pip install -r requirements.txt
```
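To verify the submodules built correctly, you can try importing them (module names as used in the upstream 3D Gaussian Splatting code; adjust if this fork differs):

```bash
python -c "import diff_gaussian_rasterization, simple_knn._C"
```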
If any errors occur, try the following fixes:
```bash
pip install numpy==1.23.5       # downgrade to 'numpy<2' if there are compatibility issues
pip install setuptools==69.5.1  # resolves "ImportError: cannot import name 'packaging' from 'pkg_resources'"
```
## Install Grasp Pose Detection (GPD)
To install GPD, first ensure OpenCV, Eigen, and PCL are properly set up by following the instructions in the GPD installation guide. After that, build GPD as follows:
```bash
cd gpd
mkdir build && cd build
cmake ..
make -j8  # use 8 cores to speed up the build
```
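As a quick smoke test, the upstream GPD tutorial runs the detector on a sample point cloud; the paths below come from the GPD repository and may need adjusting for your checkout:

```bash
./detect_grasps ../cfg/eigen_params.cfg ../tutorials/krylon.pcd
```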
If you only want to reproduce the object-query and tracking results, you can skip this step.
## Install additional dependencies for grasping by query and visualization
Install the necessary Python packages for grasping by query and visualization:
```bash
pip install viser==0.1.10 roboticstoolbox-python transforms3d
pip install panda_python  # choose the version matching your Franka robot setup; any version works for UI-only runs
```
## Compute features and train the model
To compute object part features and train the feature splatting model:
```bash
python feature-splatting-inria/compute_obj_part_feature.py -s scene_data/example_data
python feature-splatting-inria/train.py -s scene_data/example_data -m outputs/example_data --iterations 3000 --feature_type "clip_part"
```
Increasing the number of iterations can improve reconstruction quality, but higher iteration counts are not required for successful grasping.
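For instance, a longer run simply raises the `--iterations` flag; the value below is illustrative, not a recommended setting:

```bash
python feature-splatting-inria/train.py -s scene_data/example_data -m outputs/example_data --iterations 7000 --feature_type "clip_part"
```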
## Static scene grasping
For static grasping, run the following command:
```bash
python realbot_ui.py -m outputs/example_data
```
The UI is then served at http://0.0.0.0:8080, where you can run text queries and sample grasps:
- Enter a text query and click "Query" to segment the objects (a sketch of how querying works follows this list).
- Click "Generate Global Grasps" to sample grasps across the whole scene, then use "Filter with Gaussian" to prune them.
- Click "Generate Object Grasps" to get grasps near the object directly by cropping the Gaussians first.
## Dynamic scene tracking
Our code assumes a RealSense camera. Before running it, clone colmap_handeye. Then modify `fg_obj_name_list` in `multi_object_tracking.py` to specify the objects to track:
```bash
python multi_object_tracking.py -m outputs/example_data
```
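For reference, `fg_obj_name_list` presumably holds one text query per tracked object; the names below are illustrative, not from the example data:

```python
# In multi_object_tracking.py (illustrative values)
fg_obj_name_list = ["mug", "screwdriver"]
```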
To use custom data, refer to colmap_handeye, which provides tools for dataset preparation and robot-arm calibration. After obtaining the `world2base` transformation matrix, copy it into the code to align the point cloud or Gaussian splats with the robot's coordinate frame:
```python
import numpy as np

# World-to-robot-base transform obtained from colmap_handeye calibration
world2base = np.array([
    [-0.4089165231525215, -0.8358961766325012, 0.3661486842582114, 0.42083348316217706],
    [-0.9105881302403995, 0.34730407737749247, -0.22407394962882685, 0.20879287837427596],
    [0.060137626808399375, -0.4250381861999404, -0.9031755123527864, 0.5594013590398528],
    [0.0, 0.0, 0.0, 1.0],
])
```
This transformation converts the point cloud and Gaussian splats to the robot’s frame of reference for grasping tasks.
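For example, applying the transform to a set of 3D points uses standard homogeneous coordinates (the point array below is a placeholder):

```python
import numpy as np

points_world = np.random.rand(100, 3)                    # placeholder (N, 3) points in the world frame
points_h = np.hstack([points_world, np.ones((100, 1))])  # homogeneous coordinates, (N, 4)
points_base = (world2base @ points_h.T).T[:, :3]         # (N, 3) points in the robot-base frame
```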
## TODO
- Release code for static scene grasping.
- Release example data for static scene grasping.
- Release code for dynamic scene tracking.
- Release example data for dynamic scene tracking.
- Fix the bug in the Gaussian segmenter for better part-level query results.
## Acknowledgement
The grasp sampling code has been adapted from GPD, an open-source grasp pose detection framework.
## Citation
If you find this project useful, please consider citing the following paper:
```bibtex
@article{ji2024-graspsplats,
  title={GraspSplats: Efficient Manipulation with 3D Feature Splatting},
  author={Mazeyu Ji and Ri-Zhao Qiu and Xueyan Zou and Xiaolong Wang},
  journal={arXiv preprint arXiv:2409.02084},
  year={2024}
}
```