MSANet: Multi-Similarity and Attention Guidance for Boosting Few-Shot Segmentation

This is the official implementation of the paper MSANet: Multi-Similarity and Attention Guidance for Boosting Few-Shot Segmentation

Authors: Ehtesham Iqbal, Sirojbek Safarov, Seongdeok Bang

Abstract: Few-shot segmentation (FSS) aims to segment unseen-class objects given only a handful of densely labeled samples. Prototype learning, where the support feature yields a single or several prototypes by averaging global and local object information, has been widely used in FSS. However, utilizing only prototype vectors may be insufficient to represent the features for all training data. To extract abundant features and make more precise predictions, we propose a Multi-Similarity and Attention Network (MSANet) including two novel modules, a multi-similarity module and an attention module. The multi-similarity module exploits multiple feature maps of support images and query images to estimate accurate semantic relationships. The attention module instructs the network to concentrate on class-relevant information. The network is evaluated on the standard FSS benchmarks PASCAL-5ⁱ 1-shot, PASCAL-5ⁱ 5-shot, COCO-20ⁱ 1-shot, and COCO-20ⁱ 5-shot. MSANet with a ResNet-101 backbone achieves state-of-the-art performance on all four benchmarks, with mean intersection over union (mIoU) of 69.13%, 73.99%, 51.09%, and 56.80%, respectively.
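For readers new to prototype learning, the baseline idea mentioned in the abstract can be sketched as masked average pooling followed by a cosine similarity map. This is a minimal, dependency-free illustration of the generic technique, not the MSANet multi-similarity or attention module; the function names and the nested-list tensor layout are assumptions made for this example.

```python
import math


def masked_average_prototype(features, mask):
    """Average each channel of `features` over the pixels where `mask` is 1.

    features: list of C channels, each an H x W nested list of floats.
    mask:     H x W nested list of 0/1 values marking the support object.
    """
    area = sum(sum(row) for row in mask)
    if area == 0:
        raise ValueError("mask contains no foreground pixels")
    prototype = []
    for channel in features:
        total = 0.0
        for f_row, m_row in zip(channel, mask):
            for f, m in zip(f_row, m_row):
                total += f * m
        prototype.append(total / area)
    return prototype


def cosine_similarity_map(features, prototype, eps=1e-8):
    """Cosine similarity between the prototype and the feature vector at each pixel."""
    height, width = len(features[0]), len(features[0][0])
    proto_norm = math.sqrt(sum(p * p for p in prototype))
    sim = []
    for i in range(height):
        row = []
        for j in range(width):
            vec = [channel[i][j] for channel in features]
            dot = sum(v * p for v, p in zip(vec, prototype))
            norm = math.sqrt(sum(v * v for v in vec))
            row.append(dot / (norm * proto_norm + eps))
        sim.append(row)
    return sim


if __name__ == "__main__":
    # Toy 2-channel, 2 x 2 support feature map; the object occupies the left column.
    support = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]]
    support_mask = [[1, 0], [1, 0]]
    proto = masked_average_prototype(support, support_mask)
    print(proto)
    print(cosine_similarity_map(support, proto))
```

A single prototype compresses the whole support object into one vector, which is the limitation the abstract points to when motivating the use of multiple feature maps.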

Dependencies

Python 3.9
PyTorch 1.11.0
CUDA 11.0
torchvision 0.8.1
tensorboardX 2.14
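
To compare an existing environment against the versions listed above, a small check along these lines may help. It is a sketch and not part of the repository; it assumes the packages are installed under the PyPI distribution names `torch`, `torchvision`, and `tensorboardX`, and it simply copies the version strings from the list.

```python
from importlib import metadata

# Versions copied from the dependency list above; the distribution names
# (torch, torchvision, tensorboardX) are assumed PyPI names.
EXPECTED = {
    "torch": "1.11.0",
    "torchvision": "0.8.1",
    "tensorboardX": "2.14",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(expected=EXPECTED):
    """Map each package to (expected, installed) so mismatches are easy to spot."""
    return {name: (want, installed_version(name)) for name, want in expected.items()}


if __name__ == "__main__":
    for name, (want, have) in report().items():
        # startswith tolerates local build suffixes such as "+cu113".
        status = "ok" if have is not None and have.startswith(want) else "check"
        print(f"{name}: expected {want}, found {have} [{status}]")
```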

Datasets

Download the PASCAL and COCO datasets and the base annotations, and put them in the MSANet/data directory.

PASCAL-5ⁱ: VOC2012 + SBD
COCO-20ⁱ: COCO2014
Download the base annotations created by BAM from here

Download the data lists (.txt files) and put them into the MSANet/lists directory.

Models

Download the pre-trained backbones from here and put them into the MSANet/initmodel directory.
Download our trained base learners from OneDrive and put them under initmodel/PSPNet.
We provide all trained MSANet models for performance evaluation. Backbone: VGG16 & ResNet50; Dataset: PASCAL-5ⁱ & COCO-20ⁱ; Setting: 1-shot & 5-shot.
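
Before running anything, it can be useful to confirm that the directories named in the Datasets and Models sections are in place. The following is a minimal sketch, assuming the repository root is a folder called `MSANet`; the helper name `missing_dirs` is made up for this example, and the check only looks at directories, not at the files inside them.

```python
from pathlib import Path

# Directories named in the Datasets and Models sections, relative to the
# repository root (assumed here to be a folder called MSANet).
REQUIRED_DIRS = ("data", "lists", "initmodel", "initmodel/PSPNet")


def missing_dirs(repo_root):
    """Return the required directories that do not exist under `repo_root`."""
    root = Path(repo_root)
    return [d for d in REQUIRED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("MSANet")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```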

Scripts

Edit the configuration and add the weight path in the .yaml files in MSANet/config, then run test.py for testing.

Performance

Performance comparison with state-of-the-art approaches (i.e., HSNet, BAM, and VAT) in terms of average mIoU across all folds.

PASCAL-5ⁱ

Backbone	Method	1-shot	5-shot
VGG16	BAM	64.41	68.76
	MSANet(ours)	65.76 (+1.35)	70.40 (+1.64)
ResNet50	BAM	67.81	70.91
	MSANet(ours)	68.52 (+0.71)	72.60 (+1.69)
ResNet101	VAT	67.50	71.60
	MSANet(ours)	69.13 (+1.63)	73.99 (+2.39)

COCO-20ⁱ

Backbone	Method	1-shot	5-shot
ResNet50	BAM	46.23	51.16
	MSANet(ours)	48.03 (+1.80)	53.67 (+2.51)
ResNet101	HSNet	41.20	49.50
	MSANet(ours)	51.09 (+9.89)	56.80 (+7.30)
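
The tables report mIoU averaged across folds, with the gain over the compared method in parentheses. As a rough illustration of how such numbers are formed, the sketch below computes IoU for binary masks, averages a list of scores, and reproduces one of the gains from the PASCAL-5ⁱ table. It is a simplified sketch, not the repository's evaluation code; in particular, the actual evaluation may accumulate intersections and unions over many episodes before dividing, and all function names here are assumptions.

```python
def binary_iou(pred, target):
    """IoU between two binary masks given as H x W nested lists of 0/1."""
    intersection = 0
    union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Two empty masks have no union; report 0.0 rather than dividing by zero.
    return intersection / union if union else 0.0


def mean(values):
    """Arithmetic mean, used for class-wise IoUs within a fold and for fold scores."""
    values = list(values)
    return sum(values) / len(values)


def gain(ours, baseline):
    """Difference in mIoU points, rounded to two decimals as in the tables."""
    return round(ours - baseline, 2)


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [0, 0]]
    print(binary_iou(pred, target))  # one shared pixel out of two in the union
    print(gain(65.76, 64.41))        # VGG16, 1-shot, MSANet vs. BAM
```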

Visualization

References

This repo is mainly built on PFENet, HSNet, and BAM. Thanks for their great work!

### BibTeX
If you find this research useful, please consider citing:
````BibTeX
@article{MSANet2022,
  title={MSANet: Multi-Similarity and Attention Guidance for Boosting Few-Shot Segmentation},
  author={Iqbal, Ehtesham and Safarov, Sirojbek and Bang, Seongdeok},
  journal={arXiv preprint arXiv:2206.09667},
  year={2022}
}
````