This repository contains an implementation of Grounded SAM in ROS as a service server.
Grounded SAM combines Grounding DINO with the Segment Anything Model (SAM) to detect and segment objects in images from a text prompt. This implementation exposes the model as a ROS service server so it can be used from ROS-based applications.
Check out the original repository of the model at IDEA-Research/Grounded-Segment-Anything.
- Tested on ROS Noetic; it may work with other ROS distributions.
- A GPU with a minimum of 8 GB VRAM for Grounded SAM, or 4 GB for the Grounding DINO model alone.
To install and use this ROS service server, follow these steps:
- Clone this repository:

  ```bash
  cd ~/catkin_ws/src
  git clone https://github.com/HashimHS/grounding_sam_ros.git
  ```
- Install the Grounded SAM environment:

  ```bash
  # Navigate to the cloned repository directory
  cd grounding_sam_ros

  # Create and activate the conda environment
  conda env create -f gsam.yaml
  conda activate gsam

  # Install Grounding DINO (a regular install rather than an editable
  # one, so the cloned source tree can be removed afterwards)
  git clone https://github.com/IDEA-Research/GroundingDINO.git
  cd GroundingDINO/
  pip install .
  cd ..
  rm -rf GroundingDINO/

  # Install SAM
  python -m pip install git+https://github.com/facebookresearch/segment-anything.git
  ```
- Download the model weights:

  ```bash
  mkdir weights
  cd weights
  wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
  wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth
  ```
- Build the ROS workspace:

  ```bash
  # Navigate to the root of your ROS workspace
  cd ~/catkin_ws
  # Build the workspace
  catkin build
  # Source the workspace so ROS can find the package
  source devel/setup.bash
  ```
To use the Grounded SAM ROS service server, follow these steps:
- Launch the ROS node:

  ```bash
  conda activate gsam
  roslaunch grounding_sam_ros gsam.launch
  ```

  Alternatively, you can launch Grounding DINO alone for detection without segmentation:

  ```bash
  conda activate gsam
  roslaunch grounding_sam_ros dino.launch
  ```

  You should now see a new service server named "vit_detection".
- Use the service to segment objects in images. An example client:

  ```python
  import rospy
  import numpy as np
  from cv_bridge import CvBridge
  # The srv import path assumes the package is named grounding_sam_ros;
  # adjust it if your package name differs.
  from grounding_sam_ros.srv import VitDetection

  text_prompt = 'OBJECT YOU WANT TO DETECT'
  vit_detection = rospy.ServiceProxy('vit_detection', VitDetection)
  cv_bridge = CvBridge()
  # rgb_image is your input image as a numpy array (e.g. loaded with cv2)
  rgb_msg = cv_bridge.cv2_to_imgmsg(np.array(rgb_image))
  results = vit_detection(rgb_msg, text_prompt)

  # Annotated image from Grounding DINO
  annotated_frame = results.annotated_frame
  # List of detected objects
  labels = results.labels
  # Bounding boxes in y1 x1 y2 x2 format
  boxes = results.boxes
  # Detection scores
  scores = results.scores
  # Image segmentation mask
  mask = results.segmask
  ```
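Note that the service reports bounding boxes in y1 x1 y2 x2 order, while OpenCV drawing functions such as `cv2.rectangle` expect x1 y1 x2 y2 corners. A small helper to reorder them could look like this (a sketch; `yxyx_to_xyxy` is a hypothetical name, not part of this package):

```python
import numpy as np

def yxyx_to_xyxy(boxes):
    """Convert boxes from the service's (y1, x1, y2, x2) order to the
    (x1, y1, x2, y2) order expected by OpenCV drawing functions."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    # Swap the row/column pairs by reordering the columns
    return boxes[:, [1, 0, 3, 2]]

# Example: a single box with y1=10, x1=20, y2=50, x2=60
print(yxyx_to_xyxy([10, 20, 50, 60]))  # [[20. 10. 60. 50.]]
```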
In case you get the error:

```bash
ERROR: /usr/lib/x86_64-linux-gnu/libp11-kit.so.0: undefined symbol: ffi_type_pointer, version LIBFFI_BASE_7.0
```

you have to set the `LD_PRELOAD` environment variable before launching the node:

```bash
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libffi.so.7
```
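If you want the workaround to apply in every new shell, you can also append it to your shell profile (a sketch assuming bash and the libffi path above; verify the path on your system first):

```shell
# Set the workaround for the current shell
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libffi.so.7
# Persist it for future shells (bash assumed)
echo 'export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libffi.so.7' >> ~/.bashrc
```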
This ROS package is made possible by: