Pascal-EA

Primary language: Python. License: MIT.

Benchmarking Segmentation Models with Mask-Preserved Attribute Editing

This codebase provides the official PyTorch implementation of our CVPR 2024 paper:

Benchmarking Segmentation Models with Mask-Preserved Attribute Editing
Zijin Yin, Kongming Liang, Bing Li, Zhanyu Ma, Jun Guo
In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024

arXiv

TL;DR

We generate diverse synthetic samples by editing real images with diffusion models, and use the resulting synthetic-real pairs to evaluate semantic segmentation performance.
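As a rough illustration of how a synthetic-real pair can be scored, the sketch below computes mean IoU for predictions on a real image and on its edited counterpart against the same ground-truth mask (the editing is mask-preserving, so the label is shared), then reports the drop. This is a minimal pure-Python sketch, not the evaluation code of this repository: the function names and the toy 2x2 masks are hypothetical, and details such as ignore labels are left out.

```python
from typing import List, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2-D grid of class ids (hypothetical toy format)


def intersection_union(pred: Mask, gt: Mask, num_classes: int) -> Tuple[List[int], List[int]]:
    """Count per-class intersection and union pixels for one prediction."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p == g:
                inter[g] += 1
                union[g] += 1
            else:
                # A mismatched pixel belongs to the union of both classes.
                union[p] += 1
                union[g] += 1
    return inter, union


def mean_iou(preds: Sequence[Mask], gts: Sequence[Mask], num_classes: int) -> float:
    """Mean IoU over classes that appear in the predictions or labels."""
    total_inter = [0] * num_classes
    total_union = [0] * num_classes
    for pred, gt in zip(preds, gts):
        inter, union = intersection_union(pred, gt, num_classes)
        for c in range(num_classes):
            total_inter[c] += inter[c]
            total_union[c] += union[c]
    ious = [i / u for i, u in zip(total_inter, total_union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: one ground-truth mask shared by the real and the edited image.
gt_mask = [[0, 0], [1, 1]]
pred_on_real = [[0, 0], [1, 1]]    # model output on the real image
pred_on_edited = [[0, 1], [1, 1]]  # model output on the edited image

miou_real = mean_iou([pred_on_real], [gt_mask], num_classes=2)
miou_edited = mean_iou([pred_on_edited], [gt_mask], num_classes=2)
print(f"mIoU real={miou_real:.3f} edited={miou_edited:.3f} drop={miou_real - miou_edited:.3f}")
```

Comparing the two scores on paired images isolates the effect of the edited attribute, since the scene layout and the label stay fixed.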


Install

Install mmsegmentation

pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
pip install "mmsegmentation>=1.0.0"
pip install "mmdet>=3.0.0rc4"

Install diffusers and transformers.

pip install diffusers==0.17.1
pip install transformers==4.26.1
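After running the commands above, it can help to confirm which versions actually ended up in the environment. The following is a small sketch using only the standard library; the helper names `installed_version` and `report` are hypothetical, and the package list simply mirrors the names used in the install commands.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import List, Optional

# Exact pins taken from the install commands in this README.
PINNED = {"diffusers": "0.17.1", "transformers": "4.26.1"}
# Packages installed with a minimum version only; we just report what is found.
OTHERS = ("mmengine", "mmcv", "mmsegmentation", "mmdet")


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def report() -> List[str]:
    """Build one human-readable status line per package."""
    lines = []
    for name, wanted in PINNED.items():
        found = installed_version(name)
        status = "ok" if found == wanted else "check"
        lines.append(f"{name}: found {found}, README pins {wanted} [{status}]")
    for name in OTHERS:
        lines.append(f"{name}: found {installed_version(name)}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

A `None` entry means the package is not installed in the active environment; a `check` status means the installed version differs from the pinned one.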

Get Started

Please refer to

Step 1. dataset_prepare.md for dataset preparation

Step 2. text_edit.md for image caption editing

Step 3. appear_edit.md for editing image appearance attributes (color, material, ...); our conference paper mainly focuses on this part

Step 4. geo_edit.md for editing object geometry attributes (size, position); our journal extension mainly focuses on this part

Step 5. filter.md for the noise filtering strategy

Acknowledgment

Our code is built on top of several excellent research codebases and models, including PnP, LLAMA, and LLaVA. It additionally borrows the mask filtering strategy from FreeMask and the CLIP directional similarity metric from LANCE. Thanks for their contributions!
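For readers unfamiliar with the directional similarity metric mentioned above, it is commonly defined as the cosine similarity between the change in image embeddings and the change in text embeddings caused by an edit. The sketch below shows that general definition on plain Python lists. It is an assumption-laden illustration, not the implementation used in LANCE or in this repository: the function names are hypothetical, and in practice the four vectors would come from a CLIP image encoder and text encoder.

```python
import math
from typing import List, Sequence


def subtract(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Element-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def directional_similarity(
    img_src: Sequence[float],
    img_edit: Sequence[float],
    txt_src: Sequence[float],
    txt_edit: Sequence[float],
) -> float:
    """Cosine between the image-embedding shift and the text-embedding shift."""
    return cosine(subtract(img_edit, img_src), subtract(txt_edit, txt_src))


# Toy 2-D embeddings (placeholders for real encoder outputs).
aligned = directional_similarity([0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [2.0, 0.0])
opposed = directional_similarity([0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [-2.0, 0.0])
print(aligned, opposed)
```

A value near 1 suggests the image changed in the direction the caption edit describes, while a value near 0 or below suggests the edit did not take effect as intended.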