VMamba + MMRotate + DOTA-v1.0 Rotated Object Detection

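Preface

This project is only intended as a reference example for Chinese-speaking CVers who want to get the VMamba visual representation model up and running quickly.

This project is submitted as a code diff, so be sure to refer to:

This project is based on the VMamba code as of 2024/04/04.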

Project layout:

  • Use the "VMamba/classification/models/" folder as the "mmrotate/models/backbones/vmamba_models/" folder
  • Use the "VMamba/detection/model.py" file as the "mmrotate/models/backbones/vmamba_model.py" file, and modify "__init__.py"
  • Create the vmamba config files
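
To make the third step more concrete, here is a minimal sketch of what a vmamba config for mmrotate-0.3.x could look like. It is not this project's actual config: the `_base_` path, the backbone registry name `MM_VSSM`, its `out_indices` and `pretrained` arguments, the checkpoint file name, and the channel widths are all assumptions that should be checked against vmamba_model.py and the checkpoint you actually use.

```python
# Hypothetical config sketch (mmrotate-0.3.x style); every name marked
# "assumed" below is illustrative and not taken from this project.
_base_ = ['../rotated_faster_rcnn/rotated_faster_rcnn_r50_fpn_1x_dota_le90.py']

model = dict(
    backbone=dict(
        _delete_=True,                 # discard the inherited ResNet-50 settings
        type='MM_VSSM',                # assumed registry name of the VMamba backbone
        out_indices=(0, 1, 2, 3),      # assumed: expose all four stages to the FPN
        pretrained='./data/pretrained/vmamba_tiny.pth',  # assumed file name
    ),
    # assumed stage widths for a tiny model; must match the backbone outputs
    neck=dict(in_channels=[96, 192, 384, 768]),
)
```

The `_delete_=True` key is what keeps the inherited ResNet arguments from being merged into the new backbone dict; without it the base config's keys would be passed to the VMamba constructor as well.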

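Installing mmrotate-0.3.3/0.3.4/dev-1.x

[Note] We recommend mmrotate-0.3.3/0.3.4, which is a fairly stable release. mmrotate-dev-1.x, built on mmcv-2 and mmdet-3, is the future mainstream version, but its multi-scale testing may have some problems.

# Required by https://github.com/state-spaces/mamba: PyTorch 1.12+ and CUDA 11.6+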
wget https://developer.download.nvidia.com/compute/cuda/11.6.2/local_installers/cuda_11.6.2_510.47.03_linux.run
chmod +x ./cuda_11.6.2_510.47.03_linux.run
sudo ./cuda_11.6.2_510.47.03_linux.run
# CUDA 11.6 pairs with cuDNN 8.4.x (the archive below is 8.4.1.50)
# tar -xf cudnn-linux-x86_64-8.4.1.50_cuda11.6-archive.tar.xz
# sudo cp cudnn-linux-x86_64-8.4.1.50_cuda11.6-archive/include/* /usr/local/cuda-11.6/include/
# sudo cp cudnn-linux-x86_64-8.4.1.50_cuda11.6-archive/lib/* /usr/local/cuda-11.6/lib64/
# 
# vi ~/.bashrc
# Add CUDA path
export PATH=/usr/local/cuda-11.6/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.6/lib64:$LD_LIBRARY_PATH
export NCCL_P2P_DISABLE="1"
export NCCL_IB_DISABLE="1"
# 
source ~/.bashrc
nvcc -V
# 
# Do NOT use sudo when installing Anaconda
# chmod +x ./Anaconda3-2023.09-0-Linux-x86_64.sh
# ./Anaconda3-2023.09-0-Linux-x86_64.sh
# 
# Uncertain: I use pytorch==1.12.1 myself. There are reports that newer PyTorch versions may give Swin Transformer a performance gain, but the details are unclear
# conda create -n openmmlab1131 python=3.8 -y
# conda activate openmmlab1131
# # ref: https://pytorch.org/get-started/previous-versions/#v1131
# conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia
conda create -n openmmlab1121 python=3.8 -y
conda activate openmmlab1121
# ref: https://pytorch.org/get-started/previous-versions/#v1121
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.6 -c pytorch -c conda-forge
pip install shapely tqdm timm
# 
# if mmrotate-0.3.3/0.3.4
pip install openmim
mim install mmcv-full==1.6.1
mim install mmdet==2.25.1
git clone https://github.com/open-mmlab/mmrotate.git
cd mmrotate
pip install -r requirements/build.txt
pip install -v -e .
# 
# Downgrade some packages
pip install numpy==1.21.5
pip install yapf==0.40.1
# 
# # if mmrotate-dev-1.x
# pip install -U openmim
# mim install mmengine
# # Required by dev-1.x: versions later than mmcv==2.0.0rc2 and mmdet==3.0.0rc6 can be installed
# mim install mmcv==2.0.1
# mim install mmdet==3.1.0
# # The mmrotate-dev-1.x commit we use is fd60beff130a54e284a73651903de29fe728f97b; please double-check it
# git clone https://github.com/open-mmlab/mmrotate.git -b dev-1.x
# cd mmrotate
# pip install -r requirements/build.txt
# pip install -v -e .
# 
# Install the required vmamba dependencies
pip install einops fvcore triton
# Note: kernels/selective_scan comes from the VMamba repository
cd kernels/selective_scan && pip install .
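
After the install, it can help to confirm that the pinned versions actually ended up in the environment. The following is a minimal sketch using only the standard library; the pins are copied from the mmrotate-0.3.3/0.3.4 commands above, and the helper name `check_versions` is made up for illustration.

```python
from importlib import metadata

# Version pins copied from the install commands above (0.3.3/0.3.4 route).
EXPECTED = {
    "torch": "1.12.1",
    "mmcv-full": "1.6.1",
    "mmdet": "2.25.1",
    "numpy": "1.21.5",
    "yapf": "0.40.1",
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_or_None, expected, matches)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None
        # Ignore a local suffix such as "+cu116" when comparing.
        ok = have is not None and have.split("+")[0] == want
        report[name] = (have, want, ok)
    return report


if __name__ == "__main__":
    for name, (have, want, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name:10s} installed={have} expected={want} {status}")
```

Run it inside the activated conda environment; a `MISMATCH` line only means the installed version differs from the pin, not that the setup is necessarily broken.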

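Preparing the DOTA Dataset

Create the dataset by extracting the archives downloaded from the official website.

# First prepare the mmrotate-data and mmrotate-tools folders next to the mmrotate root directory
# Run all of the following commands from the mmrotate root directory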
# 
# ln -s /Workspace/Dataset/DOTA/ ../mmrotate-data/data/
# 
ln -s ../mmrotate-data/data/ ./
ln -s ../mmrotate-data/work_dirs/ ./
# 
python ../mmrotate-tools/md5_calc.py --path ./data/DOTA/train.tar.gz
# cfb5007ada913241e02c24484e12d5d2
python ../mmrotate-tools/md5_calc.py --path ./data/DOTA/val.tar.gz
# a53e74b0d69dacf3ffcb438accd60c45
python ../mmrotate-tools/md5_calc.py --path ./data/DOTA/test/part1.zip
# d3028e48da64b37ad2f2f5f31059e0da
python ../mmrotate-tools/md5_calc.py --path ./data/DOTA/test/part2.zip
# 99f779850cc44b8f8b28d348494c6b41
# 
tar -xzf ./data/DOTA/train.tar.gz -C ./data/DOTA/
tar -xzf ./data/DOTA/val.tar.gz -C ./data/DOTA/
unzip ./data/DOTA/test/part1.zip -d ./data/DOTA/test/
unzip ./data/DOTA/test/part2.zip -d ./data/DOTA/test/
# 
python ../mmrotate-tools/dir_list.py --path ./data/DOTA/train/images/ --output ./data/DOTA/train/trainset.txt
# 1411
python ../mmrotate-tools/dir_list.py --path ./data/DOTA/val/images/ --output ./data/DOTA/val/valset.txt
# 458
python ../mmrotate-tools/dir_list.py --path ./data/DOTA/test/images/ --output ./data/DOTA/test/testset.txt
# 937
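
The helper scripts md5_calc.py and dir_list.py are not shown here, so the following is only a guess at equivalent behaviour using the standard library: one function hashes a file in chunks, the other writes the file names of a directory to a text file and returns how many there were. The function names, and the choice to write names without their extension, are assumptions.

```python
import hashlib
from pathlib import Path

# Checksums copied from the comments above.
EXPECTED_MD5 = {
    "./data/DOTA/train.tar.gz": "cfb5007ada913241e02c24484e12d5d2",
    "./data/DOTA/val.tar.gz": "a53e74b0d69dacf3ffcb438accd60c45",
    "./data/DOTA/test/part1.zip": "d3028e48da64b37ad2f2f5f31059e0da",
    "./data/DOTA/test/part2.zip": "99f779850cc44b8f8b28d348494c6b41",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large archives do not fill memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def list_dir(path, output):
    """Write one file stem per line to `output` and return the file count."""
    stems = sorted(p.stem for p in Path(path).iterdir() if p.is_file())
    Path(output).write_text("".join(s + "\n" for s in stems))
    return len(stems)


if __name__ == "__main__":
    for path, want in EXPECTED_MD5.items():
        if not Path(path).exists():
            print(f"{path}: missing")
            continue
        status = "OK" if md5_of_file(path) == want else "MISMATCH"
        print(f"{path}: {status}")
```

With the extracted dataset in place, `list_dir("./data/DOTA/train/images/", "./data/DOTA/train/trainset.txt")` should return the same count as the comment next to the corresponding command above.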

Splitting Images with mmrotate

python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ss_train.json
# Total images number: 15749
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ss_val.json
# Total images number: 5297
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ss_trainval.json
# Total images number: 21046
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ss_test.json
# Total images number: 10833
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ms_trainval.json
# Total images number: 138883
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ms_test.json
# Total images number: 71888
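
A quick consistency check on the patch counts reported above. This sketch assumes that ss_trainval uses the same window settings as ss_train and ss_val (true for the stock split configs as far as I know, but worth confirming), in which case its count should be the sum of the other two.

```python
# "Total images number" values printed by img_split.py in the commands above.
patch_counts = {
    "ss_train": 15749,
    "ss_val": 5297,
    "ss_trainval": 21046,
    "ss_test": 10833,
    "ms_trainval": 138883,
    "ms_test": 71888,
}

# Assumption: trainval is split with the same settings as train and val.
assert (patch_counts["ss_train"] + patch_counts["ss_val"]
        == patch_counts["ss_trainval"])


def ms_to_ss_ratio(split):
    """How many times more patches the multi-scale split produces."""
    return patch_counts[f"ms_{split}"] / patch_counts[f"ss_{split}"]


for split in ("trainval", "test"):
    print(f"{split}: ms/ss patch ratio = {ms_to_ss_ratio(split):.2f}")
```

If your own counts differ, the usual suspects are an incomplete extraction or an edited split config.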

The pretrained models are stored in ./data/pretrained/

Installing DOTA_devkit

sudo apt install swig
swig -c++ -python polyiou.i
python setup.py build_ext --inplace

Training and Merged Testing on DOTA

Multi-GPU training:

# if mmrotate-0.3.3/0.3.4
CUDA_VISIBLE_DEVICES=0,1 ./tools/dist_train.sh ./configs/rotated_faster_rcnn/rotated_faster_rcnn_r50_fpn_1x_dota_le90.py 2
CUDA_VISIBLE_DEVICES=0,1 nohup ./tools/dist_train.sh ./configs/rotated_faster_rcnn/rotated_faster_rcnn_r50_fpn_1x_dota_le90.py 2 > nohup.log 2>&1 &
# 
# if mmrotate-dev-1.x
CUDA_VISIBLE_DEVICES=0,1 ./tools/dist_train.sh ./configs/rotated_faster_rcnn/rotated-faster-rcnn-le90_r50_fpn_1x_dota.py 2
CUDA_VISIBLE_DEVICES=0,1 nohup ./tools/dist_train.sh ./configs/rotated_faster_rcnn/rotated-faster-rcnn-le90_r50_fpn_1x_dota.py 2 > nohup.log 2>&1 &

Multi-GPU merged testing:

# if mmrotate-0.3.3/0.3.4
CUDA_VISIBLE_DEVICES=0,1 ./tools/dist_test.sh ./configs/rotated_faster_rcnn/rotated_faster_rcnn_r50_fpn_1x_dota_le90.py ./work_dirs/rotated_faster_rcnn_r50_fpn_1x_dota_le90/rotated_faster_rcnn_r50_fpn_1x_dota_le90-0393aa5c.pth 2 --format-only --eval-options submission_dir="./work_dirs/Task1_r50_033"
python "../DOTA_devkit-master/dota_evaluation_task1.py" --mergedir "./work_dirs/Task1_r50_033/" --imagesetdir "./data/DOTA/val/" --use_07_metric True
# map: 0.820117064577964
# 
# if mmrotate-dev-1.x
CUDA_VISIBLE_DEVICES=0,1 ./tools/dist_test.sh ./configs/rotated_faster_rcnn/rotated-faster-rcnn-le90_r50_fpn_1x_dota.py ./work_dirs/rotated_faster_rcnn_r50_fpn_1x_dota_le90/rotated_faster_rcnn_r50_fpn_1x_dota_le90-0393aa5c.pth 2
python "../DOTA_devkit-master/dota_evaluation_task1.py" --mergedir "./work_dirs/Task1_rotated-faster-rcnn-le90_r50_fpn_1x_dota/" --imagesetdir "./data/DOTA/val/" --use_07_metric True
# map: 0.8193743727960783

Computing Params & FLOPs:

python ./tools/analysis_tools/get_flops.py ./configs/rotated_faster_rcnn/rotated_faster_rcnn_r50_fpn_1x_dota_le90.py

Performance

All training and testing were run on 4×A100 GPUs.

  • In the table, split mAP is evaluated on ss-val; merge mAP is evaluated on ss-val or ms-test;
  • FLOPs for VMamba have not been computed yet.

Backbone  batch_size  init_lr (×1e-4)  split mAP  merge mAP  Training Cost  Testing FPS  Params   FLOPs    Configs

Rotated RetinaNet (1x ss)
Swin-T    4*4         1                67.14      68.68      0.7h           106.6        37.13M   222.08G  cfg
          4*4         2                66.91      68.72
Swin-S    4*4         1                67.54      69.66      1.9h           61.1         58.45M   314.82G  cfg
          4*4         2                68.22      70.13
Swin-B    4*4         1                68.48      70.56      2.7h           45.3         97.06M   461.50G  cfg
          4*4         2                68.60      70.65
VHeat-T   4*4         1                69.08      71.41      0.7h           101.7        ?        ?        -
          4*4         2                69.91      71.81
VHeat-S   -           -                -          -          -              -            -        -        -
          -           -                -          -
VHeat-B   4*4         1                68.43      71.68      2.0h           63.0         ?        ?        -
          4*4         2                69.42      72.02
VMamba-T  4*4         1                68.99      71.28      0.9h           94.4         ?        ?        cfg
          4*4         2                70.74      71.56
VMamba-S  4*4         1                68.48      71.45      3.0h           51.3         ?        ?        cfg
          4*4         2                70.50      71.80
VMamba-B  4*4         1                69.15      71.69      3.8h           38.5         ?        ?        cfg
          4*4         2                69.99      72.03

Rotated Faster RCNN (1x ss)
Swin-T    4*4         1                70.11      72.62      0.7h           106.1        44.76M   215.54G  cfg
          4*4         2                70.55      73.34
Swin-S    4*4         1                70.39      73.22      1.9h           58.7         66.08M   308.28G  cfg
          4*4         2                72.23      73.77
Swin-B    4*4         1                71.73      73.91      2.7h           44.1         104.11M  455.35G  cfg
          4*4         2                73.16      74.41
VHeat-T   4*4         1                72.38      74.05      0.8h           98.9         ?        ?        -
          4*4         2                72.07      74.42
VHeat-S   -           -                -          -          -              -            -        -        -
          -           -                -          -
VHeat-B   4*4         1                72.48      74.33      2.1h           59.6         ?        ?        -
          4*4         2                72.70      74.82
VMamba-T  4*4         1                73.72      74.75      1.0h           90.0         ?        ?        cfg
          4*4         2                74.91      75.31
VMamba-S  4*4         1                73.26      74.60      3.0h           48.6         ?        ?        cfg
          4*4         2                73.05      75.55
VMamba-B  4*4         1                73.43      74.16      3.8h           37.3         ?        ?        cfg
          4*4         2                74.27      75.63

Oriented RCNN (1x ss)
Swin-T    4*4         1                73.88      75.92      0.8h           105.0        44.76M   215.68G  cfg
          4*4         2                74.30      75.84
Swin-S    4*4         1                74.49      76.07      2.0h           58.7         66.08M   308.42G  cfg
          4*4         2                74.33      76.26
Swin-B    4*4         1                74.88      76.16      2.8h           44.1         104.11M  455.49G  cfg
          4*4         2                74.41      76.38
VHeat-T   4*4         1                74.35      76.22      0.8h           94.8         ?        ?        -
          4*4         2                74.88      76.78
VHeat-S   -           -                -          -          -              -            -        -        -
          -           -                -          -
VHeat-B   4*4         1                75.27      76.36      2.2h           57.6         ?        ?        -
          4*4         2                75.23      77.35
VMamba-T  4*4         1                76.24      76.51      1.0h           86.6         ?        ?        cfg
          4*4         2                76.81      77.01
VMamba-S  4*4         1                76.40      76.32      3.1h           47.8         ?        ?        cfg
          4*4         2                76.00      77.32
VMamba-B  4*4         1                76.19      76.17      3.8h           36.5         ?        ?        cfg
          4*4         2                76.25      77.61

Oriented RCNN (1x msrr) (Testing FPS, Params and FLOPs are not listed for this setting)
Backbone  batch_size  init_lr (×1e-4)  split mAP  merge mAP  Training Cost  Configs
Swin-T    4*4         2                87.77      81.25      4.6h           cfg
Swin-S    4*4         2                89.11      81.14      12.5h          cfg
Swin-B    4*4         2                89.12      81.26      17.5h          cfg
VHeat-T   4*4         2                88.91      81.53      4.9h           -
VHeat-S   -           -                -          -          -              -
VHeat-B   4*4         2                90.09      81.17      14.0h          -
VMamba-T  4*4         2                89.06      81.20      6.2h           cfg
VMamba-S  4*4         2                89.68      79.81      20.2h          cfg
VMamba-B  4*4         2                89.86      80.25      26.0h          cfg
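
To read the table a little more easily, the sketch below computes the merge-mAP difference between each VMamba backbone and the Swin backbone of the same size. The numbers are copied from the 1x ss rows with init_lr 2 (i.e. 2e-4) in the table above; the dictionary layout and function name are made up for illustration, and the differences come from single runs, so treat them as indicative only.

```python
# merge mAP at init_lr = 2e-4, copied from the 1x ss rows of the table above.
# Each list is ordered by backbone size: [T, S, B].
MERGE_MAP = {
    "Rotated RetinaNet": {"Swin": [68.72, 70.13, 70.65],
                          "VMamba": [71.56, 71.80, 72.03]},
    "Rotated Faster RCNN": {"Swin": [73.34, 73.77, 74.41],
                            "VMamba": [75.31, 75.55, 75.63]},
    "Oriented RCNN": {"Swin": [75.84, 76.26, 76.38],
                      "VMamba": [77.01, 77.32, 77.61]},
}
SIZES = ("T", "S", "B")


def vmamba_gain(table=MERGE_MAP):
    """Return {detector: {size: VMamba merge mAP minus Swin merge mAP}}."""
    gains = {}
    for detector, rows in table.items():
        gains[detector] = {
            size: rows["VMamba"][i] - rows["Swin"][i]
            for i, size in enumerate(SIZES)
        }
    return gains


if __name__ == "__main__":
    for detector, by_size in vmamba_gain().items():
        cells = ", ".join(f"{s}: {g:+.2f}" for s, g in by_size.items())
        print(f"{detector:20s} {cells}")
```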

Afterword

Updating the code is such a hassle, boo-hoo


Copyright (c) 2024 Marina Akitsuki. All rights reserved.

Date modified: 2024/04/19