
[CVPR 2024 Highlight] Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3) for Visual Robotic Manipulation


Official implementation of the paper 'Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3) for Visual Robotic Manipulation' (CVPR 2024, Highlight)

Project Website: https://sites.google.com/view/diffusion-edfs

Paper: https://arxiv.org/abs/2309.02685


Step 1. Clone the GitHub repository.

git clone --recurse-submodules https://github.com/tomato1mule/diffusion_edf


You must clone the repository RECURSIVELY so that its submodules are included. Please also use Git LFS to fetch the demo and checkpoints directories. Without LFS, they will appear empty.
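As a quick sanity check after cloning, one can confirm that the LFS-backed directories actually contain files. The following is a minimal sketch and not part of the repository: the helper name `lfs_dirs_populated` is hypothetical, and it assumes the directories are named `demo` and `checkpoints` and sit directly under the repository root.

```python
from pathlib import Path


def lfs_dirs_populated(repo_root, names=("demo", "checkpoints")):
    """Map each directory name to True if it holds at least one regular file.

    Hypothetical helper for illustration only. The directory names are
    assumed to sit directly under the repository root.
    """
    root = Path(repo_root)
    status = {}
    for name in names:
        directory = root / name
        # Any regular file anywhere below the directory counts as populated.
        status[name] = directory.is_dir() and any(
            p.is_file() for p in directory.rglob("*")
        )
    return status


if __name__ == "__main__":
    # Run from the repository root.
    for name, ok in lfs_dirs_populated(".").items():
        print(f"{name}: {'ok' if ok else 'empty or missing (check Git LFS)'}")
```

If either directory is reported as empty or missing, the clone most likely ran without Git LFS.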

Step 2. Set up a Conda/Mamba environment. We recommend Mamba for faster installation.

conda install mamba -c conda-forge
mamba create -n diff_edf python=3.8
conda activate diff_edf

Step 3. Install Diffusion EDF.

mamba install -c conda-forge cxx-compiler==1.5.0
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
pip install --no-index torch-scatter torch-sparse torch-cluster -f https://data.pyg.org/whl/torch-1.13.1+cu117.html
pip install -e .

Step 4. Install EDF Interface.

cd edf_interface
pip install -e . # If an error occurs, check that you cloned with the '--recurse-submodules' flag in Step 1.
cd ..
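To verify the installation without importing heavy packages, one option is to ask Python whether each module can be located. This is a sketch under stated assumptions: `missing_modules` is a hypothetical helper, and the import names `diffusion_edf` and `edf_interface` are assumed from the directory names rather than confirmed. The `torch_scatter`, `torch_sparse`, and `torch_cluster` names are the usual import names of the corresponding pip packages.

```python
import importlib.util

# Top-level import names expected after Steps 3 and 4.
# "diffusion_edf" and "edf_interface" are assumptions based on directory names.
EXPECTED_MODULES = (
    "torch",
    "torchvision",
    "torch_scatter",
    "torch_sparse",
    "torch_cluster",
    "diffusion_edf",
    "edf_interface",
)


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

Run it inside the activated `diff_edf` environment. A module listed as not found usually points to the step that installs it.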



Open the evaluate_<task_name>.ipynb file in Jupyter Notebook to see how Diffusion-EDFs work.


We provide three real-world manipulation examples with a Franka Panda robot:

  • evaluate_real_mug.ipynb
  • evaluate_real_bowl.ipynb
  • evaluate_real_bottle.ipynb


bash scripts/<task_name>/train.bash


To view the logs of running experiments, use TensorBoard:

tensorboard --logdir=./runs



  • scene_input, grasp_input: FeaturedPoints (NamedTuple)
    • FeaturedPoints.x: 3D positions of the points; Shape: (nPoints, 3)
    • FeaturedPoints.f: Feature vectors of the points; Shape: (nPoints, dim_feature)
    • FeaturedPoints.b: Minibatch index of each point. Currently all set to zero; Shape: (nPoints,)
    • FeaturedPoints.w: Optional point attention values; Shape: (nPoints,)
  • T_seed: Initial pose to start denoising process; Shape: (nPoses, 7)
    • T_seed[..., :4]: Quaternion (qw, qx, qy, qz)
    • T_seed[..., 4:]: Position (x, y, z)
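The pose layout listed above can be illustrated with plain Python. This is a minimal sketch, not the repository's API: `make_pose` and `split_pose` are hypothetical helpers, plain lists stand in for whatever array type the model actually consumes, and the axis-angle input is only one convenient way to produce a unit quaternion.

```python
import math


def make_pose(axis, angle_rad, position):
    """Build [qw, qx, qy, qz, x, y, z] from an axis-angle rotation and a position.

    Hypothetical helper: the quaternion comes first in (qw, qx, qy, qz)
    order, followed by the position, matching the layout described above.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0.0:
        raise ValueError("rotation axis must be non-zero")
    half = 0.5 * angle_rad
    scale = math.sin(half) / norm  # normalizes the axis and applies sin(angle / 2)
    quat = [math.cos(half), axis[0] * scale, axis[1] * scale, axis[2] * scale]
    return quat + [float(p) for p in position]


def split_pose(pose):
    """Split a 7-element pose into (quaternion, position)."""
    if len(pose) != 7:
        raise ValueError("a pose must have exactly 7 elements")
    return pose[:4], pose[4:]


# A batch of seed poses has shape (nPoses, 7); here nPoses = 2.
# Positions are given in centimeters.
T_seed = [
    make_pose((0.0, 0.0, 1.0), 0.0, (10.0, 0.0, 5.0)),           # no rotation
    make_pose((0.0, 0.0, 1.0), math.pi / 2, (0.0, 20.0, 5.0)),   # 90 degrees about z
]

quat, pos = split_pose(T_seed[1])
```

Keeping the quaternion at unit length matters because a non-normalized quaternion does not represent a pure rotation.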


Setting the unit system for positions correctly is crucial. The model in this code works in centimeters. For example, the distance between the two points (x=0., y=0., z=0.) and (x=1., y=0., z=0.) is 1 cm.


Demonstration files are saved in meters, so a rescaling step is defined in train_configs.yaml. For example:


rescale_factor: &rescale_factor 100.0 # Meters to Centimeters
  - name: "Downsample"
      voxel_size: 0.01 # In Meters
      coord_reduction: "average"
  # ...
  - name: "Rescale"
      rescale_factor: *rescale_factor

Note that the voxel size of the voxel downsample filter in the above config file is specified in meters: 0.01 m = 1 cm. More generally, 'train_configs.yaml' and 'task_configs.yaml' use meters, while 'score_model_configs' uses centimeters.
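The unit convention can be checked with a few lines of arithmetic. The sketch below only mirrors the numbers from the config excerpt. `rescale_points` is a hypothetical helper, not the repository's "Rescale" filter, and plain tuples stand in for point-cloud data.

```python
import math

RESCALE_FACTOR = 100.0  # meters -> centimeters, as in the config excerpt
VOXEL_SIZE_M = 0.01     # voxel size given in meters in the config excerpt


def rescale_points(points, factor=RESCALE_FACTOR):
    """Multiply every coordinate by `factor` (hypothetical stand-in for rescaling)."""
    return [tuple(c * factor for c in p) for p in points]


# Two points stored in meters, one voxel width apart along x.
points_m = [(0.0, 0.0, 0.0), (VOXEL_SIZE_M, 0.0, 0.0)]
points_cm = rescale_points(points_m)

# After rescaling, the same gap is about one model unit, i.e. 1 cm.
gap_cm = math.dist(points_cm[0], points_cm[1])
voxel_size_cm = VOXEL_SIZE_M * RESCALE_FACTOR

print(f"gap after rescaling: {gap_cm:.3f} cm, voxel size: {voxel_size_cm:.3f} cm")
```

When adding a new config value, it helps to decide first whether it is consumed before or after rescaling, since that determines whether it should be written in meters or centimeters.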


