Authors: Peiqi Liu*, Yaswanth Orru*, Jay Vakil, Chris Paxton, Mahi Shafiullah†, Lerrel Pinto†
* equal contribution, † equal advising.
OK-Robot is a zero-shot modular framework that effectively combines state-of-the-art navigation and manipulation models to perform pick-and-place tasks in real homes. It has been tested in 10 real homes on 170+ objects and achieved a total success rate of 58.5%.
Hardware required:
- An iPhone Pro with a LiDAR sensor
- Hello Robot Stretch with Dex Wrist installed
- A workstation with a GPU to run the pretrained models
Software required:
- Python 3.9
- Record3D (>1.18.0)
- CloudCompare
- An AnyGrasp license and model checkpoint
- Install the necessary environment on the workstation to run the navigation and manipulation modules.
- Verify the workstation installation once the above steps are completed.
- Install the necessary packages on the robot so it can properly communicate with the backend workstation.
- You may also need a newly calibrated URDF for accurate robot manipulation.
Once both the robot and workstation setups are complete, you are ready to start the experiments.
First, set up the environment with the tapes, position the robot properly, and scan the environment with Record3D to obtain an .r3d file. Place it in /navigation/r3d/ and then run the following commands.
In one terminal, run the navigation module:
mamba activate ok-robot-env
cd ok-robot-navigation
python path_planning.py debug=False min_height={z coordinates of the ground tapes + 0.1} dataset_path='r3d/{your_r3d_filename}.r3d' cache_path='{your_r3d_filename}.pt' pointcloud_path='{your_r3d_filename}.ply'
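For concreteness, here is a minimal sketch of how the navigation command's arguments fit together. The helper function and example values are illustrative only, not part of the repo: min_height is the z coordinate of the ground tapes plus 0.1 m, and the cache and point cloud paths are derived from the scan's filename.

```python
# Illustrative helper (not part of OK-Robot) that assembles the
# path_planning.py invocation from a measured ground-tape height
# and the name of your Record3D scan.

def navigation_command(ground_tape_z: float, r3d_name: str) -> str:
    min_height = ground_tape_z + 0.1  # keep points 10 cm above the floor
    return (
        "python path_planning.py debug=False "
        f"min_height={min_height:.2f} "
        f"dataset_path='r3d/{r3d_name}.r3d' "
        f"cache_path='{r3d_name}.pt' "
        f"pointcloud_path='{r3d_name}.ply'"
    )

# Example: ground tape measured at z = 0.05 m, scan saved as sweep.r3d
print(navigation_command(0.05, "sweep"))
```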
In another terminal, run the manipulation module:
mamba activate ok-robot-env
cd ok-robot-manipulation/src
python demo.py --open_communication --debug
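The --open_communication flag makes the manipulation module listen for queries sent from the robot. As a rough mental model only (a toy sketch over plain TCP sockets, not OK-Robot's actual messaging layer or message format), the exchange is a request/reply loop: the robot sends an object query, the workstation replies with a grasp:

```python
import json
import socket
import threading

def serve_one_query(srv: socket.socket) -> None:
    """Toy stand-in for the manipulation module: accept one connection,
    read a JSON query, and reply with a dummy grasp position."""
    conn, _ = srv.accept()
    with conn:
        query = json.loads(conn.recv(4096).decode())
        # A real module would run grasp perception here.
        reply = {"object": query["object"], "grasp_xyz": [0.4, 0.0, 0.8]}
        conn.sendall(json.dumps(reply).encode())

def ask_for_grasp(port: int, obj: str) -> dict:
    """Toy stand-in for the robot side: send an object query, get a reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
        cli.connect(("127.0.0.1", port))
        cli.sendall(json.dumps({"object": obj}).encode())
        return json.loads(cli.recv(4096).decode())

# Bind in the main thread so the client cannot race the server.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
srv.listen(1)
port = srv.getsockname()[1]

server = threading.Thread(target=serve_one_query, args=(srv,))
server.start()
pose = ask_for_grasp(port, "red mug")
server.join()
srv.close()
print(pose)
```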
Before running anything on the robot, you need to calibrate it by running:
stretch_robot_home.py
Our robot code relies on the robot controllers provided by home-robot. As with other home-robot-based code, you need to run two processes simultaneously in two terminals.
In one terminal, start home-robot:
roslaunch home_robot_hw startup_stretch_hector_slam.launch
In another terminal, run the robot control. More details can be found in ok-robot-hw.
cd ok-robot-hw
python run.py -x1 [x1] -y1 [y1] -x2 [x2] -y2 [y2] -ip [your workstation ip]
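Here, (x1, y1) and (x2, y2) are the coordinates of the two ground tapes, which pin down the robot's coordinate frame within the scanned map. As an illustrative sketch (the functions below are not the repo's actual implementation), two tape points define an origin and heading like this:

```python
import math

def tape_frame(x1, y1, x2, y2):
    """Return (origin, heading) of a 2D frame whose origin is tape 1
    and whose x-axis points from tape 1 toward tape 2."""
    heading = math.atan2(y2 - y1, x2 - x1)
    return (x1, y1), heading

def world_to_tape_frame(px, py, x1, y1, x2, y2):
    """Express a map point (px, py) in the tape-defined frame."""
    (ox, oy), theta = tape_frame(x1, y1, x2, y2)
    dx, dy = px - ox, py - oy
    # Rotate by -theta to align the frame's x-axis with the tape direction.
    return (
        math.cos(theta) * dx + math.sin(theta) * dy,
        -math.sin(theta) * dx + math.cos(theta) * dy,
    )

# Tape 1 at (1, 1), tape 2 at (1, 2): the frame's x-axis points along map +y,
# so the map point (1, 3) lands at roughly (2, 0) in the tape frame.
print(world_to_tape_frame(1.0, 3.0, 1.0, 1.0, 1.0, 2.0))
```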
If you find this work useful, please consider citing:
@article{liu2024okrobot,
title={OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics},
author={Liu, Peiqi and Orru, Yaswanth and Paxton, Chris and Shafiullah, Nur Muhammad Mahi and Pinto, Lerrel},
journal={arXiv preprint arXiv:2401.12202},
year={2024}
}
Our work relies on many other publications and open-source projects; if you find a particular component useful, please consider citing the original authors as well.
List of citations
@article{fang2023anygrasp,
title={Anygrasp: Robust and efficient grasp perception in spatial and temporal domains},
author={Fang, Hao-Shu and Wang, Chenxi and Fang, Hongjie and Gou, Minghao and Liu, Jirong and Yan, Hengxu and Liu, Wenhai and Xie, Yichen and Lu, Cewu},
journal={IEEE Transactions on Robotics},
year={2023},
publisher={IEEE}
}
@article{minderer2024scaling,
title={Scaling open-vocabulary object detection},
author={Minderer, Matthias and Gritsenko, Alexey and Houlsby, Neil},
journal={Advances in Neural Information Processing Systems},
volume={36},
year={2024}
}
@article{yenamandra2023homerobot,
title={HomeRobot: Open-Vocabulary Mobile Manipulation},
author={Yenamandra, Sriram and Ramachandran, Arun and Yadav, Karmesh and Wang, Austin and Khanna, Mukul and Gervet, Theophile and Yang, Tsung-Yen and Jain, Vidhi and Clegg, Alexander William and Turner, John and others},
journal={arXiv preprint arXiv:2306.11565},
year={2023}
}
While OK-Robot can do quite a bit by itself, we think there is plenty of room for improvement for a zero-shot, home-dwelling robot. That's why we consider OK-Robot a living release, and will try to occasionally add new features to it. We also encourage you to take a look at the list below, and if you are interested, share your improvements with the community by contributing to this project.
- Create OK-Robot, a shared platform for a zero-shot, open-vocab pick-and-place robot.
- Integrate grasping primitive with AnyGrasp.
- Integrate open-vocabulary navigation with VoxelMap.
- Integrate heuristic-based dropping.
- Improve documentation.
- Add error detection/recovery from failure while manipulating.
- Figure out interactive navigation: if an object is not found or a query is ambiguous, ask the end-user.
- Integrate with an open-source grasp perception model so that we can MIT-license all the dependencies.