
SUGAR: Pre-training 3D Visual Representations for Robotics

This repository is the official implementation of SUGAR: Pre-training 3D Visual Representations for Robotics (CVPR 2024).


Install

See INSTALL.md for detailed installation instructions.

Dataset

See DATASET.md for detailed instructions on dataset preparation.

Pre-training

The pretrained checkpoints are available here.

  1. Pre-training on single-object datasets
sbatch scripts/slurm/pretrain_shapenet_singleobj.slurm
sbatch scripts/slurm/pretrain_ensemble_singleobj.slurm
  2. Pre-training on multi-object datasets
sbatch scripts/slurm/pretrain_shapenet_multiobj.slurm
sbatch scripts/slurm/pretrain_ensemble_multiobj.slurm
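The slurm scripts above launch the full pre-training pipeline. To illustrate the general idea of aligning 3D point-cloud embeddings with embeddings from another modality, here is a minimal, self-contained sketch of a generic InfoNCE contrastive loss in NumPy. This is illustrative only; the function name, shapes, and temperature are assumptions, not the repository's actual training code.

```python
import numpy as np

def info_nce_loss(point_feats, paired_feats, temperature=0.07):
    """Generic InfoNCE alignment loss between two sets of paired embeddings.

    point_feats, paired_feats: (N, D) arrays; row i of each is a positive pair.
    Returns the mean cross-entropy of matching each point embedding to its
    paired embedding among the batch (an illustrative sketch, not SUGAR's code).
    """
    # L2-normalize so the dot product is cosine similarity.
    p = point_feats / np.linalg.norm(point_feats, axis=1, keepdims=True)
    q = paired_feats / np.linalg.norm(paired_feats, axis=1, keepdims=True)
    logits = p @ q.T / temperature  # (N, N) similarity matrix
    # Numerically stable softmax cross-entropy with targets on the diagonal.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 32))
# Perfectly aligned pairs yield a lower loss than randomly paired features.
aligned_loss = info_nce_loss(feats, feats)
random_loss = info_nce_loss(feats, rng.normal(size=(8, 32)))
```

With identical pairs the diagonal similarities dominate, so the loss approaches zero; with random pairings it approaches log(N).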

Zero-shot 3D object recognition

Evaluate on the ModelNet, ScanObjectNN, and Objaverse-LVIS datasets with the pretrained checkpoints.

sbatch scripts/slurm/downstream_cls_zeroshot.slurm
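Zero-shot recognition of this kind typically works by comparing a shape embedding against text embeddings of the candidate class names and picking the closest one. A minimal sketch, using toy vectors in place of the real encoder outputs (the function name and inputs are illustrative, not this repository's API):

```python
import numpy as np

def zero_shot_classify(shape_embedding, class_text_embeddings):
    """Return the index of the class whose text embedding has the
    highest cosine similarity with the shape embedding."""
    s = shape_embedding / np.linalg.norm(shape_embedding)
    t = class_text_embeddings / np.linalg.norm(
        class_text_embeddings, axis=1, keepdims=True)
    return int(np.argmax(t @ s))

# Toy stand-ins: in practice these come from the pretrained point-cloud
# encoder and a text encoder, respectively.
class_embs = np.eye(4)                       # 4 hypothetical class embeddings
shape_emb = np.array([0.1, 0.9, 0.0, 0.0])   # closest to class index 1
pred = zero_shot_classify(shape_emb, class_embs)  # → 1
```

No classifier is trained; only the nearest text embedding is selected, which is what makes the evaluation "zero-shot".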

Robotic referring expression grounding

Train and evaluate on the OCID-Ref and RoboRefIt datasets. The trained models can be downloaded here.

sbatch scripts/slurm/downstream_ocidref.slurm
sbatch scripts/slurm/downstream_roborefit.slurm
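Referring expression grounding can be viewed as scoring candidate objects in a scene against a language query and returning the best match. The sketch below shows that scoring step with toy vectors; the function name, inputs, and box format are assumptions for illustration, not this repository's interface.

```python
import numpy as np

def ground_expression(query_emb, candidate_feats, candidate_boxes):
    """Return the bounding box of the candidate object whose feature is
    most similar (cosine) to the language query embedding."""
    q = query_emb / np.linalg.norm(query_emb)
    c = candidate_feats / np.linalg.norm(candidate_feats, axis=1, keepdims=True)
    best = int(np.argmax(c @ q))
    return candidate_boxes[best]

# Toy stand-ins for a text-encoder query and per-object visual features.
query = np.array([0.0, 1.0, 0.0])
feats = np.array([[1.0, 0.0, 0.0],
                  [0.1, 0.95, 0.0],   # most similar to the query
                  [0.0, 0.0, 1.0]])
boxes = [(0, 0, 10, 10), (5, 5, 20, 20), (30, 30, 40, 40)]
box = ground_expression(query, feats, boxes)  # → (5, 5, 20, 20)
```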

Language-guided robotic manipulation

Train and evaluate on 10 RLBench tasks. The trained model can be downloaded here.

sbatch scripts/slurm/rlbench_train_multitask_10tasks.slurm
sbatch scripts/slurm/rlbench_eval_val_split.slurm
sbatch scripts/slurm/rlbench_eval_tst_split.slurm

Citation

If you find this work useful, please consider citing:

@InProceedings{Chen_2024_SUGAR,
    author    = {Chen, Shizhe and Garcia, Ricardo and Laptev, Ivan and Schmid, Cordelia},
    title     = {SUGAR: Pre-training 3D Visual Representations for Robotics},
    booktitle = {CVPR},
    year      = {2024}
}