SeMask: Semantically Masked Transformers

Jitesh Jain, Anukriti Singh, Nikita Orlov, Zilong Huang, Jiachen Li, Steven Walton, Humphrey Shi

[arXiv] [pdf] [BibTeX]

This repo contains the code for our paper SeMask: Semantically Masked Transformers for Semantic Segmentation.

Results
Setup Instructions
Citing SeMask

1. Results

Note: † denotes backbones pretrained on ImageNet-22k at 384x384 resolution.

ADE20K

Method	Backbone	Crop Size	mIoU	mIoU (ms+flip)	#params	config	Checkpoint
SeMask-T FPN	SeMask Swin-T	512x512	42.06	43.36	35M	config	checkpoint
SeMask-S FPN	SeMask Swin-S	512x512	45.92	47.63	56M	config	checkpoint
SeMask-B FPN	SeMask Swin-B†	512x512	49.35	50.98	96M	config	checkpoint
SeMask-L FPN	SeMask Swin-L†	640x640	51.89	53.52	211M	config	checkpoint
SeMask-L MaskFormer	SeMask Swin-L†	640x640	54.75	56.15	219M	config	checkpoint
SeMask-L Mask2Former	SeMask Swin-L†	640x640	56.41	57.52	222M	config	checkpoint
SeMask-L Mask2Former MSFaPN	SeMask Swin-L†	640x640	56.54	58.22	224M	config	checkpoint
SeMask-L Mask2Former FaPN	SeMask Swin-L†	640x640	56.97	58.22	227M	config	checkpoint

Cityscapes

Method	Backbone	Crop Size	mIoU	mIoU (ms+flip)	#params	config	Checkpoint
SeMask-T FPN	SeMask Swin-T	768x768	74.92	76.56	34M	config	checkpoint
SeMask-S FPN	SeMask Swin-S	768x768	77.13	79.14	56M	config	checkpoint
SeMask-B FPN	SeMask Swin-B†	768x768	77.70	79.73	96M	config	checkpoint
SeMask-L FPN	SeMask Swin-L†	768x768	78.53	80.39	211M	config	checkpoint
SeMask-L Mask2Former	SeMask Swin-L†	512x1024	83.97	84.98	222M	config	checkpoint

COCO-Stuff 10k

Method	Backbone	Crop Size	mIoU	mIoU (ms+flip)	#params	config	Checkpoint
SeMask-T FPN	SeMask Swin-T	512x512	37.53	38.88	35M	config	checkpoint
SeMask-S FPN	SeMask Swin-S	512x512	40.72	42.27	56M	config	checkpoint
SeMask-B FPN	SeMask Swin-B†	512x512	44.63	46.30	96M	config	checkpoint
SeMask-L FPN	SeMask Swin-L†	640x640	47.47	48.54	211M	config	checkpoint
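
The mIoU columns in the tables above report mean intersection-over-union as a percentage. As a reading aid, here is a minimal sketch of how that metric is commonly computed from flattened label maps. It is not the evaluation code used in this repo: the function name `mean_iou`, the `ignore_index=255` default, and the choice to average only over classes that actually occur are all assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Illustrative mIoU (in percent) over flattened label sequences.

    pred and target are equal-length sequences of integer class ids.
    Pixels whose target equals ignore_index are skipped (assumed convention).
    Predictions are assumed to lie in [0, num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Average only over classes that appear in pred or target.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        return 0.0
    return 100.0 * sum(ious) / len(ious)


# Toy example with two classes; the last pixel is ignored.
score = mean_iou(pred=[0, 0, 1, 1], target=[0, 1, 1, 255], num_classes=2)
print(score)
```

In this toy case each class has one correctly labelled pixel and a union of two pixels, so both per-class IoUs are 0.5. Benchmark toolkits typically accumulate the intersection and union counts over the whole validation set before dividing, rather than averaging per image.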

2. Setup Instructions

We provide the codebase with SeMask incorporated into various models. Please check the setup instructions inside the corresponding folders:

SeMask-FPN: Setup Instructions
SeMask-MaskFormer: Setup Instructions
SeMask-Mask2Former: Setup Instructions
SeMask-FaPN: Setup Instructions

3. Citing SeMask

@article{jain2021semask,
  title={SeMask: Semantically Masking Transformer Backbones for Effective Semantic Segmentation},
  author={Jitesh Jain and Anukriti Singh and Nikita Orlov and Zilong Huang and Jiachen Li and Steven Walton and Humphrey Shi},
  journal={arXiv},
  year={2021}
}

Acknowledgements

The code is based heavily on the following repositories: Swin-Transformer-Semantic-Segmentation, Mask2Former, MaskFormer, and FaPN-full.
