Xiaowen Ma<sup>1,2</sup>, Zhenliang Ni<sup>1</sup>, Xinghao Chen<sup>1</sup>

<sup>1</sup> Huawei Noah’s Ark Lab, <sup>2</sup> Zhejiang University

2024/09/30: We fixed some bugs in the code. This repository currently supports the following baselines and their corresponding +SSA-Seg versions.

Backbone | HRNet (CVPR'19) | MiT (NeurIPS'21) | Swin (ICCV'21) | AFFormer (AAAI'23) | SeaFormer (ICLR'23) | MSCAN (NeurIPS'22) | EfficientFormerV2 (ICCV'23) |
---|---|---|---|---|---|---|---|
Head | OCRNet (ECCV'20) | SegFormer (NeurIPS'21) | UperNet (ECCV'18) | AFFormer (AAAI'23) | SeaFormer (ICLR'23) | SegNext (NeurIPS'22) | CGRSeg (ECCV'24) |

2024/09/26: SSA-Seg was accepted to NeurIPS 2024!

SSA-Seg is an efficient and powerful pixel-level classifier that significantly improves the segmentation performance of various baselines with a negligible increase in computational cost. It has three key parts: semantic prototype adaptation (SEPA), spatial prototype adaptation (SPPA), and online multi-domain distillation.
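
To make the idea of adapting a classifier to each image more concrete, here is a minimal pure-Python sketch. It is an illustration under simplifying assumptions, not the repository's implementation: a coarse prediction from fixed class prototypes is used to pool image-specific class centers, and those centers are blended with the fixed prototypes before the pixels are classified again. The function names, the blend weight `alpha`, and the blending rule itself are hypothetical; spatial prototype adaptation and the distillation losses are left out entirely.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def classify(features, prototypes):
    """Per-pixel class probabilities from feature/prototype dot products."""
    return [softmax([dot(f, p) for p in prototypes]) for f in features]

def adapt_prototypes(features, prototypes, alpha=0.5):
    """Blend fixed prototypes with image-specific class centers (hypothetical rule)."""
    coarse = classify(features, prototypes)  # coarse mask, one row per pixel
    dim = len(prototypes[0])
    adapted = []
    for k, proto in enumerate(prototypes):
        weights = [probs[k] for probs in coarse]
        total = sum(weights)
        # Probability-weighted mean of the pixel features for class k.
        center = [
            sum(w * f[d] for w, f in zip(weights, features)) / total
            for d in range(dim)
        ]
        adapted.append([(1 - alpha) * p + alpha * c for p, c in zip(proto, center)])
    return adapted

# Toy "image": 4 pixels with 2-D features, 2 classes.
features = [[2.0, 0.1], [1.8, 0.3], [0.2, 1.5], [0.1, 1.9]]
prototypes = [[1.0, 0.0], [0.0, 1.0]]
adapted = adapt_prototypes(features, prototypes)
labels = [max(range(2), key=lambda k: p[k]) for p in classify(features, adapted)]
print(labels)  # prints [0, 0, 1, 1]
```

In this toy case the adapted prototypes move toward the features actually present in the image, which is the intuition behind replacing a fixed classifier with an image-adaptive one.
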

Iters: 160000, Input size: 512x512, Batch size: 16

- General models

+SSA-Seg | Backbone | Latency (ms) | Params (M) | FLOPs (G) | mIoU (ss) |
---|---|---|---|---|---|
OCRNet | HRNet-W48 | 69.3 | 8.7 | 165.0 | 47.67 |
UperNet | Swin-T | 54.3 | 61.1 | 236.3 | 47.56 |
SegFormer | MiT-B5 | 70.1 | 82.3 | 52.6 | 50.74 |
UperNet | Swin-L | 107.3 | 234.9 | 405.2 | 52.69 |
ViT-Adapter | ViT-Adapter-L | 284.9 | 364.9 | 616.3 | 55.39 |

- Lightweight models

+SSA-Seg | Backbone | Latency (ms) | Params (M) | FLOPs (G) | mIoU (ss) |
---|---|---|---|---|---|
AFFormer-B | AFFormer-B | 26.0 | 3.3 | 4.4 | 42.74 |
SeaFormer-B | SeaFormer-B | 27.3 | 8.8 | 1.8 | 42.46 |
SegNext-T | MSCAN-T | 23.3 | 4.6 | 6.3 | 43.90 |
SeaFormer-L | SeaFormer-L | 29.9 | 14.2 | 6.4 | 45.36 |
CGRSeg-B | EfficientFormerV2-S2 | 36.0 | 19.3 | 7.6 | 47.10 |
CGRSeg-L | EfficientFormerV2-L | 42.6 | 35.8 | 14.8 | 49.00 |

Iters: 80000, Input size: 512x512, Batch size: 16

- General models

+SSA-Seg | Backbone | Latency (ms) | Params (M) | FLOPs (G) | mIoU (ss) |
---|---|---|---|---|---|
OCRNet | HRNet-W48 | 69.3 | 8.7 | 165.0 | 37.94 |
UperNet | Swin-T | 54.3 | 61.1 | 236.3 | 42.30 |
SegFormer | MiT-B5 | 70.1 | 82.3 | 52.6 | 45.55 |
UperNet | Swin-L | 107.3 | 234.9 | 405.2 | 48.94 |
ViT-Adapter | ViT-Adapter-L | 284.9 | 364.9 | 616.3 | 51.2 |

- Lightweight models

+SSA-Seg | Backbone | Latency (ms) | Params (M) | FLOPs (G) | mIoU (ss) |
---|---|---|---|---|---|
AFFormer-B | AFFormer-B | 26.0 | 3.3 | 4.4 | 36.40 |
SeaFormer-B | SeaFormer-B | 27.3 | 8.8 | 1.8 | 35.92 |
SegNext-T | MSCAN-T | 23.3 | 4.6 | 6.3 | 38.91 |
SeaFormer-L | SeaFormer-L | 29.9 | 14.2 | 6.4 | 38.48 |

Iters: 80000, Input size: 480x480, Batch size: 16

- General models

+SSA-Seg | Backbone | Latency (ms) | Params (M) | FLOPs (G) | mIoU (ss) |
---|---|---|---|---|---|
OCRNet | HRNet-W48 | 69.3 | 8.7 | 143.3 | 50.21 |
UperNet | Swin-T | 54.3 | 61.1 | 207.7 | 55.11 |
SegFormer | MiT-B5 | 70.1 | 82.3 | 45.8 | 59.14 |
UperNet | Swin-L | 107.3 | 234.9 | 363.2 | 61.83 |
ViT-Adapter | ViT-Adapter-L | 284.9 | 364.9 | 616.3 | 66.05 |

- Lightweight models

+SSA-Seg | Backbone | Latency (ms) | Params (M) | FLOPs (G) | mIoU (ss) |
---|---|---|---|---|---|
AFFormer-B | AFFormer-B | 26.0 | 3.3 | 4.4 | 49.72 |
SeaFormer-B | SeaFormer-B | 27.3 | 8.8 | 1.8 | 47.00 |
SegNext-T | MSCAN-T | 23.3 | 4.6 | 6.3 | 52.58 |
SeaFormer-L | SeaFormer-L | 29.9 | 14.2 | 6.4 | 49.66 |

- Environment

      conda create --name ssa python=3.8 -y
      conda activate ssa
      pip install torch==1.8.2+cu102 torchvision==0.9.2+cu102 torchaudio==0.8.2
      pip install timm==0.6.13
      pip install mmcv-full==1.7.0
      pip install opencv-python==4.1.2.30
      pip install "mmsegmentation==0.30.0"

SSA-Seg is built on mmsegmentation-0.30.0; refer to its documentation for data preparation.
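
Because the setup pins exact versions, a quick check after installation can catch mismatches early. The following is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8). The helper name `check_versions` is hypothetical and not part of this repository, and the version string a given build reports may differ slightly from the pin (for example in the `+cu102` suffix).

```python
from importlib import metadata

# Versions pinned in the install commands above (distribution name -> version).
PINNED = {
    "torch": "1.8.2+cu102",
    "torchvision": "0.9.2+cu102",
    "torchaudio": "0.8.2",
    "timm": "0.6.13",
    "mmcv-full": "1.7.0",
    "opencv-python": "4.1.2.30",
    "mmsegmentation": "0.30.0",
}

def check_versions(pinned, lookup=metadata.version):
    """Return {package: (expected, found)} for each mismatch or missing package."""
    problems = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        if found != expected:
            problems[name] = (expected, found)
    return problems

if __name__ == "__main__":
    for name, (expected, found) in check_versions(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Run it inside the activated `ssa` environment; no output means every pinned package matches.
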
- Train

      # Single-gpu training
      python train.py configs/swin/upernet_swin_tiny_ade20k_ssa.py
      # Multi-gpu (4-gpu) training
      bash dist_train.sh configs/swin/upernet_swin_tiny_ade20k_ssa.py 4

- Test

      # Single-gpu testing
      python test.py configs/swin/upernet_swin_tiny_ade20k_ssa.py ${CHECKPOINT_FILE} --eval mIoU
      # Multi-gpu (4-gpu) testing
      bash dist_test.sh configs/swin/upernet_swin_tiny_ade20k_ssa.py ${CHECKPOINT_FILE} 4 --eval mIoU

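
For readers unfamiliar with the metric requested by `--eval mIoU`, the sketch below shows how mean intersection-over-union can be computed from flat lists of predicted and ground-truth labels. It is a simplified illustration, not the evaluation code used by this repository or by mmsegmentation: the function name `mean_iou` is hypothetical, the `ignore_index` of 255 is an assumed convention, and a real evaluation accumulates the counts over the whole dataset rather than over a single list.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or the ground truth."""
    intersect = [0] * num_classes
    pred_area = [0] * num_classes
    target_area = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count toward any class
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersect[p] += 1
    ious = []
    for k in range(num_classes):
        union = pred_area[k] + target_area[k] - intersect[k]
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(intersect[k] / union)
    return sum(ious) / len(ious) if ious else float("nan")

pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(round(score, 4))  # prints 0.7222
```

The result is a fraction in [0, 1]; the tables in this README report the same quantity as a percentage.
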
- Benchmark

      python benchmark.py configs/swin/upernet_swin_tiny_ade20k_ssa.py ${CHECKPOINT_FILE} --repeat-times 5

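
As a rough picture of what a repeated latency measurement involves, here is a generic timing harness in plain Python. It is a sketch under assumptions and does not reproduce `benchmark.py`: the function name `measure_latency_ms` and its `warmup` and `iters` defaults are hypothetical, and only the idea of repeating the measurement several times mirrors the `--repeat-times 5` flag above.

```python
import statistics
import time

def measure_latency_ms(fn, warmup=5, iters=20, repeat_times=5):
    """Return one average per-call latency (in ms) for each of `repeat_times` runs."""
    for _ in range(warmup):
        fn()  # warm-up calls are excluded from the timing
    results = []
    for _ in range(repeat_times):
        start = time.perf_counter()
        for _ in range(iters):
            fn()
        elapsed = time.perf_counter() - start
        results.append(1000.0 * elapsed / iters)
    return results

# Stand-in workload; for a GPU model, the device would also have to be
# synchronized (e.g. torch.cuda.synchronize()) before each clock reading.
results = measure_latency_ms(lambda: sum(range(10000)))
print(f"mean {statistics.mean(results):.3f} ms, stdev {statistics.stdev(results):.3f} ms")
```

Reporting both the mean and the spread across repeats makes it easier to tell a real latency difference from run-to-run noise.
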
If you are interested in our work, please consider giving a 🌟 and citing our work below.
@inproceedings{
ssaseg,
title={{SSA}-Seg: Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation},
author={Xiaowen Ma and Zhen-Liang Ni and Xinghao Chen},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024},
url={https://openreview.net/forum?id=RZZo23pQFL}
}
Thanks to these previously open-sourced repositories: SeaFormer, CAC, AFFormer, SegNeXt, mmsegmentation, CGRSeg, and ViT-Adapter.