SATR: Zero-Shot Semantic Segmentation of 3D Shapes

Introduction

We explore the task of zero-shot semantic segmentation of 3D shapes by using large-scale off-the-shelf 2D image recognition models. Surprisingly, we find that modern zero-shot 2D object detectors are better suited for this task than contemporary text/image similarity predictors or even zero-shot 2D segmentation networks. Our key finding is that it is possible to extract accurate 3D segmentation maps from multi-view bounding box predictions by using the topological properties of the underlying surface. For this, we develop the Segmentation Assignment with Topological Reweighting (SATR) algorithm and evaluate it on ShapeNetPart and our proposed FAUST benchmarks. SATR achieves state-of-the-art performance and outperforms a baseline algorithm by 1.3% and 4% average mIoU on the FAUST coarse and fine-grained benchmarks, respectively, and by 5.2% average mIoU on the ShapeNetPart benchmark. Our source code and data will be publicly released. Project webpage: https://samir55.github.io/SATR/.
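To make the idea of lifting multi-view bounding boxes to per-face labels more concrete, here is a deliberately simplified sketch. It is not the SATR algorithm and not this repository's implementation: it uses plain majority voting and omits the topological reweighting entirely. The function name `vote_face_labels`, the `views` data layout, and the `"unknown"` fallback label are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def vote_face_labels(views, num_faces, unlabeled="unknown"):
    """Assign each face the label with the most votes across views.

    views: list of dicts (hypothetical layout) with keys
      "face_pixels": {face_id: (x, y)} projected centers of visible faces
      "boxes": list of (label, (x_min, y_min, x_max, y_max)) detections
    """
    votes = defaultdict(Counter)
    for view in views:
        for label, (x0, y0, x1, y1) in view["boxes"]:
            for face_id, (x, y) in view["face_pixels"].items():
                # A visible face votes for a label if its projected
                # center falls inside that label's bounding box.
                if x0 <= x <= x1 and y0 <= y <= y1:
                    votes[face_id][label] += 1
    return [
        votes[f].most_common(1)[0][0] if votes[f] else unlabeled
        for f in range(num_faces)
    ]


# Two toy views of a four-face mesh; face 3 is never visible.
views = [
    {"face_pixels": {0: (10, 10), 1: (50, 50)},
     "boxes": [("head", (0, 0, 20, 20))]},
    {"face_pixels": {0: (12, 8), 1: (40, 60), 2: (45, 65)},
     "boxes": [("head", (0, 0, 20, 20)), ("leg", (30, 50, 60, 80))]},
]
labels = vote_face_labels(views, num_faces=4)
print(labels)
```

Plain voting like this treats every face inside a box equally, which is exactly the weakness that a reweighting scheme based on the surface is meant to address.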

For additional detail, please see "SATR: Zero-Shot Semantic Segmentation of 3D Shapes"
by Ahmed Abdelreheem, Ivan Skorokhodov, Maks Ovsjanikov, and Peter Wonka
from KAUST and LIX, Ecole Polytechnique.

Installation

A. Create Environment

conda create -n meshseg python=3.9
conda activate meshseg
conda install cudatoolkit=11.1 -c conda-forge
pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html
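Before building Kaolin, it can help to confirm that the environment picked up the intended PyTorch build. The following is a small sanity-check sketch; the helper name `describe_torch` is hypothetical and not part of this repository.

```python
import importlib


def describe_torch():
    """Return version info for the installed torch, or None if it is missing."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return None
    return {
        "torch": torch.__version__,          # PyTorch version string
        "cuda_build": torch.version.cuda,    # CUDA version torch was built with
        "cuda_available": torch.cuda.is_available(),
    }


info = describe_torch()
print(info if info else "torch is not installed in this environment")
```

With the pinned install above, the reported torch version is expected to carry the `+cu111` suffix. If `cuda_available` is false, the GPU driver or the `CUDA_VISIBLE_DEVICES` setting is worth checking before moving on.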

B. Build/Install Kaolin

You can first try installing one of the pre-built wheels found here. For example, for Kaolin 0.13.0 with CUDA 11.1 and PyTorch 1.10.0, run:

pip install kaolin==0.13.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-1.10.0_cu111.html

If that does not work, build Kaolin from source with the steps below.

Clone Kaolin in some directory outside the repo.

git clone --recursive https://github.com/NVIDIAGameWorks/kaolin
cd kaolin

then,

git checkout v0.13.0 # optional
pip install -r tools/build_requirements.txt -r tools/viz_requirements.txt -r tools/requirements.txt
python setup.py develop

C. Install the package

git clone https://github.com/Samir55/SATR
cd SATR/
pip install -e .

D. Install GLIP

cd GLIP/
python setup.py build develop --user

NOTE: Download the pretrained GLIP model from here, and put it in GLIP/MODEL/

Datasets

For FAUST, please download the FAUST benchmark dataset from this link and put it in data/FAUST.
For the ShapeNetPart dataset, please download the labelled meshes from this link. We use the official test split provided here.

Code Running

Demo

Please create a suitable config file to run on an input mesh (see the configs folder for examples). For instance, to run on a penguin example, use the following command from the repository root directory:

CUDA_VISIBLE_DEVICES=0 python scripts/single_dataset_example.py -cfg configs/demo/penguin.yaml -mesh_name penguin.obj -output_dir outputs/demo/penguin

FAUST/ShapeNetPart

To run the coarse segmentation on a single example of the FAUST dataset (for instance, tr_scan_000), use the following command:

CUDA_VISIBLE_DEVICES=0 python scripts/single_dataset_example.py -cfg configs/faust/coarse.yaml -mesh_name tr_scan_000.obj -output_dir path_to_output_dir

and for the fine-grained segmentation:

CUDA_VISIBLE_DEVICES=0 python scripts/single_dataset_example.py -cfg configs/faust/fine_grained.yaml -mesh_name tr_scan_000.obj -output_dir path_to_output_dir
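Since the evaluation expects predictions for all 100 scans, the single-example command has to be repeated per mesh. The sketch below only builds the command strings, reusing the flags shown above. The helper name `build_faust_commands` is hypothetical, and it assumes the scans are named tr_scan_000.obj through tr_scan_099.obj and that all predictions can share one output directory; please verify both against your local copy of the data.

```python
import shlex


def build_faust_commands(cfg, output_dir, num_scans=100, gpu="0"):
    """Build one shell command per FAUST scan, mirroring the commands above."""
    commands = []
    for i in range(num_scans):
        # Assumed naming scheme, extrapolated from the tr_scan_000 example.
        mesh_name = f"tr_scan_{i:03d}.obj"
        args = [
            "python", "scripts/single_dataset_example.py",
            "-cfg", cfg,
            "-mesh_name", mesh_name,
            "-output_dir", output_dir,
        ]
        commands.append(f"CUDA_VISIBLE_DEVICES={gpu} " + shlex.join(args))
    return commands


commands = build_faust_commands(
    "configs/faust/coarse.yaml", "outputs/coarse_output_dir"
)
print(commands[0])
```

The resulting strings can be written to a shell script or passed to a job scheduler; swapping in configs/faust/fine_grained.yaml gives the fine-grained runs.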

For the ShapeNetPart models, please run scripts/single_dataset_example.py with the suitable config file for each category, found in configs/shapenetpart.

Evaluation

Given an output dir (for example coarse_output_dir) containing the coarse or fine-grained predictions for the 100 scans, run the following:

python scripts/evaluate_faust.py -output_dir outputs/coarse_output_dir

or, for the fine-grained predictions:

python scripts/evaluate_faust.py --fine_grained -output_dir outputs/fine_grained_output_dir
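For readers unfamiliar with the mIoU metric reported by the benchmarks, here is a minimal sketch of how a mean intersection-over-union over part labels can be computed for one shape. It is an illustration only and makes no claim about how scripts/evaluate_faust.py is implemented. In particular, the function name `mean_iou` and the choice to skip labels absent from both prediction and ground truth are assumptions.

```python
def mean_iou(pred, gt, labels):
    """Mean intersection-over-union over the given part labels.

    pred, gt: equal-length sequences of per-face (or per-vertex) labels.
    Labels that appear in neither sequence are skipped (an assumption).
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    ious = []
    for label in labels:
        inter = sum(p == label and g == label for p, g in zip(pred, gt))
        union = sum(p == label or g == label for p, g in zip(pred, gt))
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


pred = ["arm", "arm", "leg", "leg"]
gt = ["arm", "leg", "leg", "leg"]
score = mean_iou(pred, gt, labels=["arm", "leg", "head"])
print(round(score, 4))
```

In this toy case "arm" has IoU 1/2 and "leg" has IoU 2/3, while "head" is skipped, so the mean is 7/12.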

Credits

This codebase builds on code from the 3DHighlighter, GLIP HuggingFace demo, and Grounded-Segment-Anything repositories. Thanks to the authors for their awesome work!

Citation

If you find this work useful in your research, please consider citing:

@article{abdelreheem2023SATR,
        author = {Abdelreheem, Ahmed and Skorokhodov, Ivan and Ovsjanikov, Maks and Wonka, Peter},
        title = {SATR: Zero-Shot Semantic Segmentation of 3D Shapes},
        journal = {Computing Research Repository (CoRR)},
        volume = {abs/2304.04909},
        year = {2023}
}