/YOLOR

implementation of paper - You Only Learn One Representation: Unified Network for Multiple Tasks (https://arxiv.org/abs/2105.04206)

Primary LanguagePythonGNU General Public License v3.0GPL-3.0

YOLOR

implementation of paper - You Only Learn One Representation: Unified Network for Multiple Tasks

PWC

Unified Network

To get the results on the table, please use this branch.

Model Test Size APtest AP50test AP75test batch1 throughput batch32 inference
YOLOR-P6 1280 54.1% 71.8% 59.3% 76 fps 8.3 ms
YOLOR-W6 1280 55.5% 73.2% 60.6% 66 fps 10.7 ms
YOLOR-E6 1280 56.4% 74.1% 61.6% 45 fps 17.1 ms
YOLOR-D6 1280 57.3% 75.0% 62.7% 34 fps 21.8 ms
YOLOR-D6* 1280 57.8% 75.5% 63.3% 34 fps 21.8 ms
YOLOv4-P5 896 51.8% 70.3% 56.6% 41 fps (old) -
YOLOv4-P6 1280 54.5% 72.6% 59.8% 30 fps (old) -
YOLOv4-P7 1536 55.5% 73.4% 60.8% 16 fps (old) -
  • Fix the speed bottleneck on our NFS, many thanks to NCHC, TWCC, and NARLabs support teams.

To reproduce the inference speed, please see darknet.

Model Test Size APval AP50val AP75val APSval APMval APLval batch1 throughput
YOLOv4-CSP 640 49.1% 67.7% 53.8% 32.1% 54.4% 63.2% 76 fps
YOLOR-CSP 640 49.2% 67.6% 53.7% 32.9% 54.4% 63.0% weights
YOLOR-CSP* 640 50.0% 68.7% 54.3% 34.2% 55.1% 64.3% weights
YOLOv4-CSP-X 640 50.9% 69.3% 55.4% 35.3% 55.8% 64.8% 53 fps
YOLOR-CSP-X 640 51.1% 69.6% 55.7% 35.7% 56.0% 65.2% weights
YOLOR-CSP-X* 640 51.5% 69.9% 56.1% 35.8% 56.8% 66.1% weights

Developing...

Model Test Size APtest AP50test AP75test APStest APMtest APLtest
YOLOR-CSP 640 51.1% 69.6% 55.7% 31.7% 55.3% 64.7%
YOLOR-CSP-X 640 53.0% 71.4% 57.9% 33.7% 57.1% 66.8%

Train from scratch for 300 epochs...

Model Info Test Size AP
YOLOR-CSP evolution 640 48.0%
YOLOR-CSP strategy 640 50.0%
YOLOR-CSP strategy + simOTA 640 51.1%
YOLOR-CSP-X strategy 640 51.5%
YOLOR-CSP-X strategy + simOTA 640 53.0%

Installation

Docker environment (recommended)

Expand
# create the docker container, you can change the share memory size if you have more.
nvidia-docker run --name yolor -it -v your_coco_path/:/coco/ -v your_code_path/:/yolor --shm-size=64g nvcr.io/nvidia/pytorch:20.11-py3

# apt install required packages
apt update
apt install -y zip htop screen libgl1-mesa-glx

# pip install required packages
pip install seaborn thop

# install mish-cuda if you want to use mish activation
# https://github.com/thomasbrandon/mish-cuda
# https://github.com/JunnYu/mish-cuda
cd /
git clone https://github.com/JunnYu/mish-cuda
cd mish-cuda
python setup.py build install

# install pytorch_wavelets if you want to use dwt down-sampling module
# https://github.com/fbcotter/pytorch_wavelets
cd /
git clone https://github.com/fbcotter/pytorch_wavelets
cd pytorch_wavelets
pip install .

# go to code folder
cd /yolor

Colab environment

Expand
git clone https://github.com/WongKinYiu/yolor
cd yolor

# pip install required packages
pip install -qr requirements.txt

# install mish-cuda if you want to use mish activation
# https://github.com/thomasbrandon/mish-cuda
# https://github.com/JunnYu/mish-cuda
git clone https://github.com/JunnYu/mish-cuda
cd mish-cuda
python setup.py build install
cd ..

# install pytorch_wavelets if you want to use dwt down-sampling module
# https://github.com/fbcotter/pytorch_wavelets
git clone https://github.com/fbcotter/pytorch_wavelets
cd pytorch_wavelets
pip install .
cd ..

Prepare COCO dataset

Expand
cd /yolor
bash scripts/get_coco.sh

Prepare pretrained weight

Expand
cd /yolor
bash scripts/get_pretrain.sh

Testing

yolor_p6.pt

python test.py --data data/coco.yaml --img 1280 --batch 32 --conf 0.001 --iou 0.65 --device 0 --cfg cfg/yolor_p6.cfg --weights yolor_p6.pt --name yolor_p6_val

You will get the results:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.52510
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.70718
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.57520
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.37058
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.56878
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.66102
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.39181
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.65229
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.71441
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.57755
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.75337
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.84013

Training

Single GPU training:

python train.py --batch-size 8 --img 1280 1280 --data coco.yaml --cfg cfg/yolor_p6.cfg --weights '' --device 0 --name yolor_p6 --hyp hyp.scratch.1280.yaml --epochs 300

Multiple GPU training:

python -m torch.distributed.launch --nproc_per_node 2 --master_port 9527 train.py --batch-size 16 --img 1280 1280 --data coco.yaml --cfg cfg/yolor_p6.cfg --weights '' --device 0,1 --sync-bn --name yolor_p6 --hyp hyp.scratch.1280.yaml --epochs 300

Training schedule in the paper:

python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train.py --batch-size 64 --img 1280 1280 --data data/coco.yaml --cfg cfg/yolor_p6.cfg --weights '' --device 0,1,2,3,4,5,6,7 --sync-bn --name yolor_p6 --hyp hyp.scratch.1280.yaml --epochs 300
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 tune.py --batch-size 64 --img 1280 1280 --data data/coco.yaml --cfg cfg/yolor_p6.cfg --weights 'runs/train/yolor_p6/weights/last_298.pt' --device 0,1,2,3,4,5,6,7 --sync-bn --name yolor_p6-tune --hyp hyp.finetune.1280.yaml --epochs 450
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train.py --batch-size 64 --img 1280 1280 --data data/coco.yaml --cfg cfg/yolor_p6.cfg --weights 'runs/train/yolor_p6-tune/weights/epoch_424.pt' --device 0,1,2,3,4,5,6,7 --sync-bn --name yolor_p6-fine --hyp hyp.finetune.1280.yaml --epochs 450

Inference

yolor_p6.pt

python detect.py --source inference/images/horses.jpg --cfg cfg/yolor_p6.cfg --weights yolor_p6.pt --conf 0.25 --img-size 1280 --device 0

You will get the results:

horses

Citation

@article{wang2021you,
  title={You Only Learn One Representation: Unified Network for Multiple Tasks},
  author={Wang, Chien-Yao and Yeh, I-Hau and Liao, Hong-Yuan Mark},
  journal={arXiv preprint arXiv:2105.04206},
  year={2021}
}

Acknowledgements

Expand