
Code for our CVPR2020 paper "Strip Pooling: Rethinking Spatial Pooling for Scene Parsing"


Strip Pooling: Rethinking Spatial Pooling for Scene Parsing

This repository is a PyTorch implementation for our CVPR2020 paper (non-commercial use only).

The results reported in our paper were originally obtained with PyTorch-Encoding, but its environment setup is somewhat complicated. For ease of use, we have reimplemented our work on top of semseg.

Strip Pooling

An efficient way to use strip pooling

Usage

Before training your own models, we recommend that you read the instructions described here. Then update the dataset paths in the configuration files.

Synchronized training requires four GPUs with at least 11 GB of memory each. PyTorch (>=1.0.1) and Apex are required for Sync-BN support. To install Apex, follow its "Quick Start" section.
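The requirements above can be checked before launching a job. The following is only a minimal sketch, not part of this repository: the function names `parse_version` and `check_environment` are hypothetical, and the memory threshold is compared in decimal gigabytes on the assumption that this is what "11 GB" refers to.

```python
import importlib.util
import re


def parse_version(text):
    """Turn a version string such as '1.0.1' or '2.1.0+cu118' into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def check_environment(min_gpus=4, min_mem_gb=11, min_torch=(1, 0, 1)):
    """Return a list of problems found; an empty list means every check passed."""
    if importlib.util.find_spec("torch") is None:
        return ["PyTorch is not installed"]
    import torch

    problems = []
    if parse_version(torch.__version__) < min_torch:
        problems.append(f"PyTorch {torch.__version__} is older than required")
    if importlib.util.find_spec("apex") is None:
        problems.append("Apex is not installed (needed for Sync-BN)")

    gpu_count = torch.cuda.device_count()
    if gpu_count < min_gpus:
        problems.append(f"found {gpu_count} GPU(s), expected at least {min_gpus}")
    for index in range(gpu_count):
        # Assumption: compare in decimal GB (1e9 bytes), since cards sold as
        # "11 GB" usually report slightly less than 11 GiB of usable memory.
        total_gb = torch.cuda.get_device_properties(index).total_memory / 1e9
        if total_gb < min_mem_gb:
            problems.append(f"GPU {index} has only {total_gb:.1f} GB of memory")
    return problems
```

Calling `check_environment()` and printing the returned list gives a quick summary of what is missing on the current machine.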

Pretrained models (resnet50 and resnet101) can be downloaded from here. Then create a new folder named "pretrained" and move the downloaded models into it:

mkdir pretrained
mv downloaded_pretrained_model ./pretrained/
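To confirm that the models ended up in the right place, a small check like the one below may help. It is a sketch only: the helper name `list_pretrained` is hypothetical, and the `.pth`/`.pt` suffixes are an assumption about how the downloaded files are named.

```python
from pathlib import Path


def list_pretrained(folder="pretrained", suffixes=(".pth", ".pt")):
    """Return the checkpoint files found directly inside `folder`, sorted by name."""
    root = Path(folder)
    if not root.is_dir():
        # The folder has not been created yet.
        return []
    return sorted(
        path for path in root.iterdir()
        if path.is_file() and path.suffix in suffixes
    )
```

An empty result means either the folder is missing or no file with one of the assumed suffixes was found in it.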

For training, run

sh tool/train.sh dataset_name model_name

For instance, in our case, you can run

sh tool/train.sh ade20k spnet50

For testing, run

sh tool/test.sh dataset_name model_name
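If you prefer to launch these commands from Python, for example to queue several dataset and model pairs, a thin wrapper is enough. This is a sketch and not part of the repository: `build_command` and `run` are hypothetical helpers, and the script path is passed in by the caller rather than assumed.

```python
import shlex
import subprocess


def build_command(script, dataset_name, model_name):
    """Build the argument list for `sh <script> <dataset_name> <model_name>`."""
    return ["sh", script, dataset_name, model_name]


def run(script, dataset_name, model_name, dry_run=True):
    """Run the command, or only return it as a string when dry_run is True."""
    command = build_command(script, dataset_name, model_name)
    if not dry_run:
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run(command, check=True)
    return shlex.join(command)
```

With `dry_run=True` (the default) nothing is executed, so the command line can be inspected first; pass the same script path, dataset name, and model name that you would use in the shell commands above.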

Multi-GPU testing is not supported at present. We plan to implement it later.

Better Results

After the CVPR submission, we found empirically that replacing the original expansion operation with bilinear interpolation in our strip pooling module yields better performance. This simple modification improves the result on ADE20K from the 45.60 reported in our paper to 46.25, which sets a new state-of-the-art result.

We believe that designing more sophisticated strip pooling modules would further benefit model performance.
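For readers new to the idea, the core of strip pooling can be illustrated without any framework. The sketch below is a simplified illustration, not the module in this repository: it works on a single 2-D map stored as nested lists, leaves out the learned convolutions that the real module uses, and expands the strips by plain replication rather than the bilinear interpolation mentioned above. The names `strip_pool` and `strip_attention` are hypothetical.

```python
import math


def strip_pool(feature):
    """Average a 2-D map along each axis.

    Returns (row_means, col_means): row_means[i] is the mean of row i
    (an H x 1 strip) and col_means[j] is the mean of column j (a 1 x W strip).
    """
    height, width = len(feature), len(feature[0])
    row_means = [sum(row) / width for row in feature]
    col_means = [
        sum(feature[i][j] for i in range(height)) / height
        for j in range(width)
    ]
    return row_means, col_means


def strip_attention(feature):
    """Expand both strips back to H x W, add them, and gate the input with a sigmoid."""
    row_means, col_means = strip_pool(feature)
    return [
        [
            # value * sigmoid(row strip + column strip) at position (i, j)
            value / (1.0 + math.exp(-(row_means[i] + col_means[j])))
            for j, value in enumerate(row)
        ]
        for i, row in enumerate(feature)
    ]
```

Each output position is modulated by the statistics of its entire row and its entire column, which is how long, narrow context gets captured. In PyTorch, the two pooling steps roughly correspond to adaptive average pooling with output sizes (H, 1) and (1, W).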

Contact

If you are interested in this work and want to further investigate the techniques of pooling, you are welcome to contact me via andrewhoux@gmail.com.

Citation

You may want to cite:

@inproceedings{hou2020strip,
  title={{Strip Pooling}: Rethinking Spatial Pooling for Scene Parsing},
  author={Hou, Qibin and Zhang, Li and Cheng, Ming-Ming and Feng, Jiashi},
  booktitle={CVPR},
  year={2020}
}
@misc{semseg2019,
  author={Zhao, Hengshuang},
  title={semseg},
  howpublished={\url{https://github.com/hszhao/semseg}},
  year={2019}
}