Details of Image Segmentation with SAM
ramdrop opened this issue · 2 comments
Thanks for sharing this cool project! I was confused about how you segment 2D images using VFMs:
As a result, SAM is able to segment images, with either point, box, or mask prompts, across different domains and data distributions. (from 6.2 Vision Foundation Models)
What did you feed to SAM to get the final segmented 2D image?
Thanks for your explanation.
Thanks for your interest in our work!
We feed each RGB image from the multi-view cameras to SAM (and the other VFMs) to generate a one-channel image of the same size as the input, where each pixel carries a mask ID corresponding to a distinct superpixel.
See the figures below for an example.
[Figure: Input | Output]
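For reference, here is a minimal sketch of how such a one-channel superpixel map can be built with SAM's automatic mask generator. The checkpoint path, model type, and the area-based ID assignment below are illustrative assumptions, not necessarily our exact pipeline:

```python
import numpy as np
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Load a SAM checkpoint (the path and model type here are assumptions).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

def image_to_superpixels(image: np.ndarray) -> np.ndarray:
    """Map an HxWx3 uint8 RGB image to an HxW int32 mask-ID image.

    Pixels covered by no mask keep ID 0; each SAM mask gets a distinct
    positive ID. Larger masks are painted first so that smaller,
    overlapping masks are not erased (an assumed tie-breaking rule).
    """
    masks = mask_generator.generate(image)  # list of dicts with a 'segmentation' bool array
    superpixels = np.zeros(image.shape[:2], dtype=np.int32)
    for mask_id, m in enumerate(
        sorted(masks, key=lambda m: m["area"], reverse=True), start=1
    ):
        superpixels[m["segmentation"]] = mask_id
    return superpixels
```

Running this on each of the six nuScenes camera images would yield per-view superpixel maps of the kind shown above.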
The code for generating superpixels with these VFMs will be available soon; kindly refer to it then for more details.
We will also upload our generated superpixels to Google Drive later. Stay tuned!
Hi @ramdrop, the code for generating semantic superpixels on the nuScenes dataset is out. Kindly refer to SUPERPOINT.md for the detailed instructions.
Our generated superpixels will be available very soon.
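Once the files are released, a quick sanity check could look like the snippet below; the file path and single-channel PNG format are assumptions based on the description above (see SUPERPOINT.md for the actual layout):

```python
import numpy as np
from PIL import Image

# Hypothetical path; the real naming scheme is documented in SUPERPOINT.md.
superpixels = np.array(Image.open("superpixels/CAM_FRONT/sample_token.png"))
ids = np.unique(superpixels)
print(f"{superpixels.shape} map with {len(ids)} distinct superpixel IDs")
```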