Aerial image segmentation is semantic segmentation from a top-down perspective and has several challenging characteristics: a strong foreground-background imbalance, complex backgrounds, intra-class heterogeneity, inter-class homogeneity, and small objects. To handle these problems, we inherit the advantages of Transformers and propose AerialFormer, which unifies a Transformer encoder at the contracting path with a lightweight Multi-Dilated Convolutional Neural Network (MD-CNN) decoder at the expanding path. AerialFormer is designed as a hierarchical structure in which the Transformer encoder outputs multi-scale features and the MD-CNN decoder aggregates information across those scales. It thus takes both local and global context into consideration to render powerful representations and high-resolution segmentation. We have benchmarked AerialFormer on three common datasets: iSAID, LoveDA, and Potsdam. Comprehensive experiments and extensive ablation studies show that our proposed AerialFormer outperforms previous state-of-the-art methods by a remarkable margin.
Our code is built on mmsegmentation, which is updated rapidly; please make sure you are using the same or a compatible version. Refer to get_started for installation and dataset_prepare for dataset preparation in mmsegmentation. Note, however, that not all of our code is identical to upstream (e.g., the Potsdam dataset conversion).
Since some datasets do not permit redistribution, you need to obtain the zip files yourself. Please check mmsegmentation/dataset_prepare for how to get them.
After that, please run the following commands to prepare the datasets (iSAID, LoveDA, Potsdam).
iSAID
Download the original images from DOTA and the annotations from iSAID, and put the dataset source files in one directory. For more details, check the iSAID DevKit.
python tools/convert_datasets/isaid.py /path/to/iSAID
Potsdam
For the Potsdam dataset, please run one of the following commands to re-organize the dataset. Put the dataset source files in one directory (see the layout sketch below). We used '2_Ortho_RGB.zip' and '5_Labels_all_noBoundary.zip'.
- With clutter (6 classes):
python tools/convert_datasets/potsdam.py /path/to/potsdam
- Without clutter (5 classes):
python tools/convert_datasets/potsdam_no_clutter.py /path/to/potsdam
Note that we changed some settings from the original convert_datasets code in mmsegmentation.
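For reference, a minimal sketch of the source directory layout the commands above expect, based on the two archives listed; the directory name is simply whatever path you pass to the script:

/path/to/potsdam
├── 2_Ortho_RGB.zip
└── 5_Labels_all_noBoundary.zip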
LoveDA
Download the dataset from Google Drive here. For the LoveDA dataset, please run the following command to re-organize the dataset.
python tools/convert_datasets/loveda.py /path/to/loveDA
More details about LoveDA can be found here.
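If you plan to evaluate on the held-out LoveDA test set (which is scored on an online server rather than locally), a hedged sketch using mmsegmentation's format-only test mode; the imgfile_prefix value below is an arbitrary example, not a required path:

python tools/test.py configs/path/to/config work_dirs/path/to/checkpoint --format-only --eval-options imgfile_prefix=./loveda_results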
We use mmcv-full==1.7.1 and mmsegmentation==0.30.0. Please follow mmsegmentation for the other dependencies.
If you have not installed Singularity yet, please refer to AICV to install it.
Environment Setup
Build Image from docker/Dockerfile
export REGISTRY_NAME="user"
export IMAGE_NAME="aerialformer"
docker build -t $REGISTRY_NAME/$IMAGE_NAME docker/ # You can use 'thanyu/aerialformer'
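Once built, a minimal sketch of launching a container from the image (the --gpus all flag assumes the NVIDIA Container Toolkit is installed; the /workspace mount point is an assumption, not something the Dockerfile dictates):

docker run --gpus all -it --rm -v $PWD:/workspace $REGISTRY_NAME/$IMAGE_NAME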
Training
- Single GPU
export DATAPATH="path/to/data"  # if not specified, it defaults to "$PWD/data"
bash tools/singularity_train.sh configs/path/to/config
For example, to train AerialFormer-T on the LoveDA dataset:
bash tools/singularity_train.sh configs/aerialformer/aerialformer_tiny_512x512_loveda.py
- Multi GPUs
export DATAPATH="path/to/data"  # if not specified, it defaults to "$PWD/data"
bash tools/singularity_dist_train.sh configs/path/to/config num_gpus
For example, to train AerialFormer-S on the LoveDA dataset on two GPUs:
bash tools/singularity_dist_train.sh configs/aerialformer/aerialformer_small_512x512_loveda.py 2
Evaluation
- Single GPU
bash tools/singularity_test.sh configs/path/to/config work_dirs/path/to/trained_weight --eval metrics
For example, to test AerialFormer-T on the LoveDA dataset:
bash tools/singularity_test.sh configs/aerialformer/aerialformer_tiny_512x512_loveda.py work_dirs/aerialformer_tiny_512x512_loveda/2023_0101_0000/latest.pth --eval mIoU
- Multi GPUs
bash tools/singularity_dist_test.sh configs/path/to/config work_dirs/path/to/trained_weight num_gpus --eval metrics
For example, to test AerialFormer-S on the LoveDA dataset:
bash tools/singularity_dist_test.sh work_dirs/aerialformer_small_512x512_loveda/2023_0612_1009/aerialformer_small_512x512_loveda.py work_dirs/aerialformer_small_512x512_loveda/2023_0612_1009/latest.pth 2 --eval mIoU
Environment Setup
STEP 1. Install mmcv-full and mmsegmentation with the following commands.
For more information, refer to mmsegmentation/get_started.
pip install -U openmim && mim install mmcv-full=="1.7.1"
pip install mmsegmentation==0.30.0
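To sanity-check the installation (not part of the original instructions, just a quick verification):

python -c "import mmcv, mmseg; print(mmcv.__version__, mmseg.__version__)"  # expect 1.7.1 and 0.30.0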
STEP 2. Clone this repository and install.
git clone https://github.com/UARK-AICV/AerialFormer.git
cd AerialFormer
pip install -v -e .
Training
- Single GPU
python tools/train.py configs/path/to/config
For example, to train AerialFormer-T on the LoveDA dataset:
python tools/train.py configs/aerialformer/aerialformer_tiny_512x512_loveda.py
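By default, mmsegmentation writes logs and checkpoints to work_dirs/<config_name>; to redirect them, you can pass train.py's --work-dir option (the directory below is an arbitrary example):

python tools/train.py configs/aerialformer/aerialformer_tiny_512x512_loveda.py --work-dir work_dirs/my_loveda_run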
- Multi GPUs
bash tools/dist_train.sh configs/path/to/config num_gpus
For example, to train AerialFormer-B on the LoveDA dataset on two GPUs:
bash tools/dist_train.sh configs/aerialformer/aerialformer_base_512x512_loveda.py 2
Note that the batch size matters: we train with a total batch size of 8.
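If your GPU count differs, one way to keep the total batch size at 8 is to override the per-GPU samples from the command line; a sketch assuming the config uses mmsegmentation 0.x's standard data.samples_per_gpu field (2 GPUs × 4 samples per GPU = 8 in total):

bash tools/dist_train.sh configs/aerialformer/aerialformer_base_512x512_loveda.py 2 --cfg-options data.samples_per_gpu=4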
Evaluation
- Single GPU
python tools/test.py configs/path/to/config work_dirs/path/to/checkpoint --eval metrics
For example, to test AerialFormer-T on the LoveDA dataset:
python tools/test.py work_dirs/aerialformer_tiny_512x512_loveda/2023_0101_0000/aerialformer_tiny_512x512_loveda.py work_dirs/aerialformer_tiny_512x512_loveda/2023_0101_0000/latest.pth --eval mIoU
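To save qualitative predictions as images, mmsegmentation's test.py supports --show-dir (the output directory below is an arbitrary example):

python tools/test.py configs/path/to/config work_dirs/path/to/checkpoint --show-dir work_dirs/vis_results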
- Multi GPUs
bash tools/dist_test.sh configs/path/to/config work_dirs/path/to/checkpoint num_gpus --eval metrics
For example, to test AerialFormer-T on the LoveDA dataset:
bash tools/dist_test.sh work_dirs/aerialformer_tiny_512x512_loveda/2023_0101_0000/aerialformer_tiny_512x512_loveda.py work_dirs/aerialformer_tiny_512x512_loveda/2023_0101_0000/latest.pth 2 --eval mIoU
We thank the following open-source project(s).
If you find this work helpful, please consider citing the following paper:
@article{yamazaki2023aerialformer,
title={AerialFormer: Multi-resolution Transformer for Aerial Image Segmentation},
author={Yamazaki, Kashu and Hanyu, Taisei and Tran, Minh and Garcia, Adrian and Tran, Anh and McCann, Roy and Liao, Haitao and Rainwater, Chase and Adkins, Meredith and Molthan, Andrew and others},
journal={arXiv preprint arXiv:2306.06842},
year={2023}
}