This is an unofficial implementation of Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes (DDRNet), which achieves a state-of-the-art trade-off between accuracy and speed on Cityscapes and CamVid without inference acceleration or extra data. On a single 2080Ti GPU, DDRNet-23-slim yields 77.4% mIoU at 109 FPS on the Cityscapes test set and 74.4% mIoU at 230 FPS on the CamVid test set.
The code mainly borrows from HRNet-Semantic-Segmentation OCR and the official repository. Thanks for their work.
Here is the software and hardware used in my experiments:
- pytorch==1.7.0
- 2 x NVIDIA 3080 GPUs
- cuda==11.1
Download the Cityscapes dataset, rename the folder to `cityscapes`, and put it under the `data` folder.
└── data
    ├── cityscapes
    └── list
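As a quick sanity check before training or evaluation, the layout above can be verified with a few lines of Python. This is a minimal sketch, not part of the repository: the helper name `missing_dataset_dirs` is hypothetical, and it only assumes the two sub-folders shown in the tree.

```python
from pathlib import Path

# Sub-folders expected under data/, taken from the tree above.
EXPECTED_SUBDIRS = ("cityscapes", "list")


def missing_dataset_dirs(project_root):
    """Return the expected data sub-folders that do not exist yet.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    data_dir = Path(project_root) / "data"
    return [name for name in EXPECTED_SUBDIRS if not (data_dir / name).is_dir()]


if __name__ == "__main__":
    # Run from the project root.
    missing = missing_dataset_dirs(".")
    if missing:
        print("Missing under data/:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

Running it from `${PROJECT}` prints which of the two folders still needs to be created or downloaded.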
Download the ImageNet-pretrained model or the segmentation model from the official repository, and put the files in the `${PROJECT}/pretrained_models` folder.
Use the official pretrained model. With ydhongHIT's advice, it can now reach the same accuracy as in the paper. Thanks.
- For DDRNet23_slim
cd ${PROJECT}
python tools/demo_detect.py --cfg experiments/cityscapes/ddrnet23_slim.yaml --source /path/to/video.mp4 --show True
- For DDRNet23
cd ${PROJECT}
python tools/demo_detect.py --cfg experiments/cityscapes/ddrnet23.yaml --source /path/to/video.mp4 --show True
model | Train Set | Test Set | OHEM | Multi-scale | Flip | mIoU | Link |
---|---|---|---|---|---|---|---|
DDRNet23_slim | unknown | eval | Yes | No | No | 77.83 | official |
DDRNet23_slim | unknown | eval | Yes | No | Yes | 78.42 | official |
DDRNet23 | unknown | eval | Yes | No | No | 79.51 | official |
DDRNet23 | unknown | eval | Yes | No | Yes | 79.98 | official |
Note
- The `***.yaml` files of DDRNet should have an image size of (1024, 1024) or (512, 512).
- Setting `ALIGN_CORNERS: false` in `***.yaml` reaches higher accuracy.
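To make the effect of this flag concrete, here is a minimal pure-Python sketch of how bilinear resizing maps an output pixel index to an input coordinate in each mode. It assumes `ALIGN_CORNERS` is forwarded to the `align_corners` argument of the interpolation call, as in HRNet-Semantic-Segmentation; the function name `source_coordinate` is made up for illustration.

```python
def source_coordinate(dst_index, in_size, out_size, align_corners):
    """Map an output pixel index to a (fractional) input coordinate.

    Illustrative only: this mirrors the usual bilinear-resize convention,
    it is not code from this repository.
    """
    if align_corners:
        # The first and last pixels of input and output coincide exactly.
        if out_size == 1:
            return 0.0
        return dst_index * (in_size - 1) / (out_size - 1)
    # Pixels are treated as unit squares, so pixel centers are aligned instead.
    # Values outside [0, in_size - 1] are normally clamped when sampling.
    return (dst_index + 0.5) * in_size / out_size - 0.5


if __name__ == "__main__":
    # Upsample a row of 4 pixels to 8 pixels in both modes.
    for flag in (False, True):
        coords = [round(source_coordinate(i, 4, 8, flag), 3) for i in range(8)]
        print("align_corners =", flag, coords)
```

The two modes sample the feature map at slightly different positions, so a model should be evaluated with the same setting it was trained with; that is likely why the flag changes the measured mIoU.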
Download the ImageNet-pretrained model, then train the model on 2 NVIDIA 3080 GPUs:
cd ${PROJECT}
python -m torch.distributed.launch --nproc_per_node=2 tools/train.py --cfg experiments/cityscapes/ddrnet23_slim.yaml
My own trained models (more coming soon):
model | Train Set | Test Set | OHEM | Multi-scale | Flip | mIoU | Link |
---|---|---|---|---|---|---|---|
DDRNet23_slim | train | eval | Yes | No | Yes | 77.77 | Baidu/password:it2s |
DDRNet23_slim | train | eval | Yes | Yes | Yes | 79.57 | Baidu/password:it2s |
DDRNet23 | train | eval | Yes | No | Yes | ~ | None |
DDRNet39 | train | eval | Yes | No | Yes | ~ | None |
Note
- Set `ALIGN_CORNERS: true` in `***.yaml`, because I use the default setting from HRNet-Semantic-Segmentation OCR.
- Multi-scale evaluation uses the scales 0.5, 0.75, 1.0, 1.25, 1.5, 1.75. It runs very slowly.
- From ydhongHIT: changing to `align_corners=True` gives better performance; the default option is `False`.
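The slowdown of multi-scale evaluation can be estimated with a rough sketch. It assumes a 1024 x 2048 Cityscapes input, one forward pass per scale plus one more per scale when flipping, and that cost grows roughly with the number of pixels processed. The function names are hypothetical and the numbers are an estimate, not a measurement.

```python
SCALES = (0.5, 0.75, 1.0, 1.25, 1.5, 1.75)  # scales listed in the note above


def inference_plan(height, width, scales=SCALES, flip=True):
    """List (scale, flipped, scaled_height, scaled_width) for each forward pass."""
    plan = []
    for scale in scales:
        size = (int(height * scale + 0.5), int(width * scale + 0.5))
        plan.append((scale, False, *size))
        if flip:
            # Horizontal flip adds a second pass at the same resolution.
            plan.append((scale, True, *size))
    return plan


def relative_pixel_cost(plan, height, width):
    """Total pixels processed, relative to one pass at the original size."""
    return sum(h * w for _, _, h, w in plan) / (height * width)


if __name__ == "__main__":
    plan = inference_plan(1024, 2048)
    print("forward passes:", len(plan))
    print("relative pixel cost:", relative_pixel_cost(plan, 1024, 2048))
```

Under these assumptions, six scales with flipping mean 12 forward passes and roughly 17 times the pixels of a single-scale pass, which is consistent with the note that multi-scale evaluation is slow.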