
This is a TensorFlow-based rotation detection benchmark, also called AlphaRotate.


AlphaRotate: A Rotation Detection Benchmark using TensorFlow


🚀🚀🚀 News: MMRotate has been released at https://github.com/open-mmlab/mmrotate 🚀🚀🚀

Abstract

AlphaRotate is mainly maintained by Xue Yang of Shanghai Jiao Tong University, supervised by Prof. Junchi Yan.

Papers and code related to remote sensing/aerial image detection: DOTA-DOAI.

Techniques:

The above-mentioned rotation detectors are all modified based on the following horizontal detectors:


Latest Performance

DOTA (Task1)

Baseline

| Backbone | Neck | Training/test dataset | Data Augmentation | Epoch | NMS |
|---|---|---|---|---|---|
| ResNet50_v1d 600->800 | FPN | trainval/test | × | 13 (AP50) or 17 (AP50:95) is enough for baseline (default is 13) | gpu nms (slightly worse <1% than cpu nms but faster) |
| Method | Baseline | DOTA1.0 | DOTA1.5 | DOTA2.0 | Model | Anchor | Angle Pred. | Reg. Loss | Angle Range | Configs |
|---|---|---|---|---|---|---|---|---|---|---|
| - | RetinaNet-R | 67.25 | 56.50 | 42.04 | Baidu Drive (bi8b) | R | Reg. (∆⍬) | smooth L1 | [-90,0) | dota1.0, dota1.5, dota2.0 |
| - | RetinaNet-H | 64.17 | 56.10 | 43.06 | Baidu Drive (bi8b) | H | Reg. (∆⍬) | smooth L1 | [-90,90) | dota1.0, dota1.5, dota2.0 |
| - | RetinaNet-H | 65.33 | 57.21 | 44.58 | Baidu Drive (bi8b) | H | Reg. (sin⍬, cos⍬) | smooth L1 | [-90,90) | dota1.0, dota1.5, dota2.0 |
| - | RetinaNet-H | 65.73 | 58.87 | 44.16 | Baidu Drive (bi8b) | H | Reg. (∆⍬) | smooth L1 | [-90,0) | dota1.0, dota1.5, dota2.0 |
| IoU-Smooth L1 | RetinaNet-H | 66.99 | 59.17 | 46.31 | Baidu Drive (qcvc) | H | Reg. (∆⍬) | iou-smooth L1 | [-90,0) | dota1.0, dota1.5, dota2.0 |
| RIDet | RetinaNet-H | 66.06 | 58.91 | 45.35 | Baidu Drive (njjv) | H | Quad. | hungarian loss | - | dota1.0, dota1.5, dota2.0 |
| RSDet | RetinaNet-H | 67.27 | 61.42 | 46.71 | Baidu Drive (2a1f) | H | Quad. | modulated loss | - | dota1.0, dota1.5, dota2.0 |
| CSL | RetinaNet-H | 67.38 | 58.55 | 43.34 | Baidu Drive (sdbb) | H | Cls.: Gaussian (r=1, w=10) | smooth L1 | [-90,90) | dota1.0, dota1.5, dota2.0 |
| DCL | RetinaNet-H | 67.39 | 59.38 | 45.46 | Baidu Drive (m7pq) | H | Cls.: BCL (w=180/256) | smooth L1 | [-90,90) | dota1.0, dota1.5, dota2.0 |
| - | FCOS | 67.69 | 61.05 | 48.10 | Baidu Drive (pic4) | - | Quad | smooth L1 | - | dota1.0, dota1.5, dota2.0 |
| RSDet++ | FCOS | 67.91 | 62.18 | 48.81 | Baidu Drive (8ww5) | - | Quad | modulated loss | - | dota1.0, dota1.5, dota2.0 |
| GWD | RetinaNet-H | 68.93 | 60.03 | 46.65 | Baidu Drive (7g5a) | H | Reg. (∆⍬) | gwd | [-90,0) | dota1.0, dota1.5, dota2.0 |
| GWD + SWA | RetinaNet-H | 69.92 | 60.60 | 47.63 | Baidu Drive (qcn0) | H | Reg. (∆⍬) | gwd | [-90,0) | dota1.0, dota1.5, dota2.0 |
| BCD | RetinaNet-H | 71.23 | 60.78 | 47.48 | Baidu Drive (0puk) | H | Reg. (∆⍬) | bcd | [-90,0) | dota1.0, dota1.5, dota2.0 |
| KLD | RetinaNet-H | 71.28 | 62.50 | 47.69 | Baidu Drive (o6rv) | H | Reg. (∆⍬) | kld | [-90,0) | dota1.0, dota1.5, dota2.0 |
| KFIoU | RetinaNet-H | 70.64 | 62.71 | 48.04 | Baidu Drive (o72o) | H | Reg. (∆⍬) | kfiou | [-90,0) | dota1.0, dota1.5, dota2.0 |
| R3Det | RetinaNet-H | 70.66 | 62.91 | 48.43 | Baidu Drive (n9mv) | H->R | Reg. (∆⍬) | smooth L1 | [-90,0) | dota1.0, dota1.5, dota2.0 |
| DCL | R3Det | 71.21 | 61.98 | 48.71 | Baidu Drive (eg2s) | H->R | Cls.: BCL (w=180/256) | iou-smooth L1 | [-90,0)->[-90,90) | dota1.0, dota1.5, dota2.0 |
| GWD | R3Det | 71.56 | 63.22 | 49.25 | Baidu Drive (jb6e) | H->R | Reg. (∆⍬) | smooth L1->gwd | [-90,0) | dota1.0, dota1.5, dota2.0 |
| BCD | R3Det | 72.22 | 63.53 | 49.71 | Baidu Drive (v60g) | H->R | Reg. (∆⍬) | bcd | [-90,0) | dota1.0, dota1.5, dota2.0 |
| KLD | R3Det | 71.73 | 65.18 | 50.90 | Baidu Drive (tq7f) | H->R | Reg. (∆⍬) | kld | [-90,0) | dota1.0, dota1.5, dota2.0 |
| KFIoU | R3Det | 72.28 | 64.69 | 50.41 | Baidu Drive (u77v) | H->R | Reg. (∆⍬) | kfiou | [-90,0) | dota1.0, dota1.5, dota2.0 |
| - | R2CNN (Faster-RCNN) | 72.27 | 66.45 | 52.35 | Baidu Drive (02s5) | H->R | Reg. (∆⍬) | smooth L1 | [-90,0) | dota1.0, dota1.5, dota2.0 |

SOTA

| Method | Backbone | DOTA1.0 | Model | MS | Data Augmentation | Epoch | Configs |
|---|---|---|---|---|---|---|---|
| R2CNN-BCD | ResNet152_v1d-FPN | 79.54 | Baidu Drive (h2u1) | √ | √ | 34 | dota1.0 |
| RetinaNet-BCD | ResNet152_v1d-FPN | 78.52 | Baidu Drive (0puk) | √ | √ | 51 | dota1.0 |
| R3Det-BCD | ResNet50_v1d-FPN | 79.08 | Baidu Drive (v60g) | √ | √ | 51 | dota1.0 |
| R3Det-BCD | ResNet152_v1d-FPN | 79.95 | Baidu Drive (v60g) | √ | √ | 51 | dota1.0 |

Note:

  • Single GPU training: SAVE_WEIGHTS_INTE = iter_epoch * 1 (DOTA1.0: iter_epoch=27000, DOTA1.5: iter_epoch=32000, DOTA2.0: iter_epoch=40000)
  • Multi-GPU training (better): SAVE_WEIGHTS_INTE = iter_epoch * 2
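
The arithmetic in the note above can be sketched in a few lines of Python. This is only an illustration, not code from the repository: the helper name `save_weights_interval` and the `ITER_EPOCH` dictionary are hypothetical, and only the `iter_epoch` values and the x1 / x2 factors come from the note.

```python
# iter_epoch values copied from the note above.
ITER_EPOCH = {"DOTA1.0": 27000, "DOTA1.5": 32000, "DOTA2.0": 40000}


def save_weights_interval(dataset: str, multi_gpu: bool = False) -> int:
    """Return SAVE_WEIGHTS_INTE: iter_epoch * 1 (single GPU) or * 2 (multi-GPU).

    Hypothetical helper for illustration; the repository sets this value
    directly in its config files.
    """
    factor = 2 if multi_gpu else 1
    return ITER_EPOCH[dataset] * factor


if __name__ == "__main__":
    for name in ITER_EPOCH:
        single = save_weights_interval(name)
        multi = save_weights_interval(name, multi_gpu=True)
        print(f"{name}: single GPU = {single}, multi-GPU = {multi}")
```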

Installation

Manual configuration (cuda version < 11)

pip install -r requirements.txt
pip install -v -e .  # or "python setup.py develop"

Or, you can simply install AlphaRotate with the following command:

pip install alpharotate  # Not suitable for dev.

Docker (cuda version < 11)

docker images: yangxue2docker/yx-tf-det:tensorflow1.13.1-cuda10-gpu-py3

Note: For 30xx-series graphics cards (cuda version >= 11), I recommend following this blog to install tf1.xx, or downloading an image from tensorflow-release-notes that matches your development environment, e.g. nvcr.io/nvidia/tensorflow:20.11-tf1-py3

# Run from the repository root.
cd alpharotate/libs/utils/cython_utils
rm *.so
rm *.c
rm *.cpp
python setup.py build_ext --inplace  # or: make

# Return to the repository root first, then:
cd alpharotate/libs/utils/
rm *.so
rm *.c
rm *.cpp
python setup.py build_ext --inplace

Download Model

Pretrain weights

Download the pretrained weights you need from one of the following three options, then put them in $PATH_ROOT/dataloader/pretrained_weights.

  1. MxNet pretrained weights (recommended in this repo, default in NET_NAME): resnet_v1d, resnet_v1b, refer to gluon2TF.
  2. TensorFlow pretrained weights: resnet50_v1, resnet101_v1, resnet152_v1, efficientnet, mobilenet_v2, darknet53 (Baidu Drive (1jg2), Google Drive).
  3. PyTorch pretrained weights, refer to pretrain_zoo.py and Others.

Trained weights

  1. Download the models trained by this project, then put them in $PATH_ROOT/output/pretained_weights.

Train

  1. If you want to train your own dataset, please note:

    (1) Select the detector and dataset you want to use, and mark them as #DETECTOR and #DATASET (such as #DETECTOR=retinanet and #DATASET=DOTA)
    (2) Modify parameters (such as CLASS_NUM, DATASET_NAME, VERSION, etc.) in $PATH_ROO./configs/#DATASET/#DETECTOR/cfgs_xxx.py
    (3) Copy $PATH_ROO./configs/#DATASET/#DETECTOR/cfgs_xxx.py to $PATH_ROO./configs/cfgs.py
    (4) Add category information in $PATH_ROOT/libs/label_name_dict/label_dict.py     
    (5) Add data_name to $PATH_ROOT/dataloader/dataset/read_tfrecord.py  
    
  2. Make tfrecord
    If the images are very large (as in the DOTA dataset), they need to be cropped. Take the DOTA dataset as an example:

    cd $PATH_ROOT/dataloader/dataset/DOTA
    python data_crop.py
    

    If the images do not need to be cropped, just convert the annotation files to xml format (refer to example.xml).

    cd $PATH_ROOT/dataloader/dataset/
    python convert_data_to_tfrecord.py --root_dir='/PATH/TO/DOTA/' \
                                       --xml_dir='labeltxt' \
                                       --image_dir='images' \
                                       --save_name='train' \
                                       --img_format='.png' \
                                       --dataset='DOTA'
    
  3. Start training

    cd $PATH_ROOT/tools/#DETECTOR
    python train.py
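
To illustrate the cropping mentioned in step 2, here is a minimal sketch of how a large image can be split into overlapping tiles. It is not the logic of data_crop.py: the function name `crop_windows` and the default tile size (600) and overlap (150) are assumptions chosen for the example.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def crop_windows(width: int, height: int,
                 size: int = 600, overlap: int = 150) -> List[Box]:
    """Return overlapping tiles that together cover a width x height image.

    Hypothetical helper; `size` and `overlap` are illustrative defaults.
    """
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    stride = size - overlap

    def starts(length: int) -> List[int]:
        # An image side that fits in a single tile needs one window only.
        if length <= size:
            return [0]
        positions = list(range(0, length - size, stride))
        positions.append(length - size)  # last tile sits flush with the border
        return positions

    return [
        (x, y, min(x + size, width), min(y + size, height))
        for y in starts(height)
        for x in starts(width)
    ]


if __name__ == "__main__":
    for box in crop_windows(1000, 600):
        print(box)
```

Overlap between neighbouring tiles reduces the chance that an object is cut in half at a tile border, at the cost of producing duplicate detections that have to be merged after testing.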
    

Test

  1. For large-scale images, take the DOTA dataset as an example (the output files and visualizations are in $PATH_ROOT/tools/#DETECTOR/test_dota/VERSION):

    cd $PATH_ROOT/tools/#DETECTOR
    python test_dota.py --test_dir='/PATH/TO/IMAGES/' \
                        --gpus=0,1,2,3,4,5,6,7
    # Optional flags:
    #   -ms  multi-scale testing
    #   -s   visualization
    #   -cn  use cpu nms (slightly better, by <1%, than gpu nms, but slower)

    or (recommended in this repo; better than multi-scale testing)

    python test_dota_sota.py --test_dir='/PATH/TO/IMAGES/' \
                             --gpus=0,1,2,3,4,5,6,7
    # Optional flags:
    #   -s   visualization
    #   -cn  use cpu nms (slightly better, by <1%, than gpu nms, but slower)
    

    Notice: To make it easy to resume from a breakpoint, the result files are opened in 'a+' (append) mode. If a model with the same #VERSION needs to be tested again, delete the original test results first.

  2. For small-scale images, take the HRSC2016 dataset as an example:

    cd $PATH_ROOT/tools/#DETECTOR
    python test_hrsc2016.py --test_dir='/PATH/TO/IMAGES/' \
                            --gpu=0 \
                            --image_ext='bmp' \
                            --test_annotation_path='/PATH/TO/ANNOTATIONS'
    # Optional flag:
    #   -s   visualization
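
The notice about 'a+' mode in the DOTA test step is easy to overlook, so here is a small sketch of why stale results must be deleted before re-testing. It uses only the Python standard library; the file name and the result line are made up for the demonstration and do not reflect the repository's actual output format.

```python
import os
import tempfile
from typing import List, Tuple


def write_results(path: str, lines: List[str]) -> None:
    """Append result lines to `path`; 'a+' keeps whatever is already there."""
    with open(path, "a+") as f:
        for line in lines:
            f.write(line + "\n")


def count_lines(path: str) -> int:
    with open(path) as f:
        return sum(1 for _ in f)


def demo() -> Tuple[int, int]:
    """Return (line count after a naive re-test, line count after a clean re-test)."""
    detections = ["img_0001 0.90"]  # hypothetical result line
    with tempfile.TemporaryDirectory() as tmp:
        result_file = os.path.join(tmp, "results.txt")  # hypothetical file name

        write_results(result_file, detections)  # first test run
        write_results(result_file, detections)  # re-test without cleanup
        duplicated = count_lines(result_file)

        os.remove(result_file)                  # delete the old results first
        write_results(result_file, detections)  # clean re-test
        clean = count_lines(result_file)
    return duplicated, clean


if __name__ == "__main__":
    print(demo())
```

Without the cleanup, the second run appends to the first, so every detection is counted twice in the evaluation input.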
    

Tensorboard

cd $PATH_ROOT/output/summary
tensorboard --logdir=.


Citation

If you find our code useful for your research, please consider citing it.

@article{yang2021alpharotate,
    author  = {Yang, Xue and Zhou, Yue and Yan, Junchi},
    title   = {AlphaRotate: A Rotation Detection Benchmark using TensorFlow},
    journal = {arXiv preprint arXiv:2111.06677},
    year    = {2021},
}

Reference

1. https://github.com/endernewton/tf-faster-rcnn
2. https://github.com/zengarden/light_head_rcnn
3. https://github.com/tensorflow/models/tree/master/research/object_detection
4. https://github.com/fizyr/keras-retinanet