The Pytorch Implementation of Real-time Semantic Segmentation via Spatial-detail Guided Context Propagation
The CamVid dataset contains 701 images annotated with 32 semantic categories. Following previous works, we evaluate our segmentation model on 11 of these categories; for details, please refer to the papers [1], [2], [3]. The CamVid dataset can be downloaded from http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/.
The Cityscapes dataset contains 25,000 road-scene images, of which 5,000 are finely annotated and 20,000 carry only coarse annotations. In our experiments, we adopt only the finely annotated subset, which involves 30 semantic categories; following previous works [2], [3], [4], [5], we adopt 19 of them for model evaluation. The Cityscapes dataset can be downloaded from https://www.cityscapes-dataset.com/. Accordingly, the toolkits for data pre-processing can be found at https://github.com/mcordts/cityscapesScripts.
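Reducing the annotated categories to the 19 evaluation classes is typically done by remapping raw label ids to contiguous train ids, with everything else sent to an ignore index. The sketch below assumes the standard Cityscapes id-to-trainId table (the authoritative version lives in `cityscapesscripts/helpers/labels.py`); the function name `encode_train_ids` is illustrative, not part of this repository.

```python
import numpy as np

# Standard Cityscapes raw id -> train id table for the 19 evaluation classes
# (see cityscapesscripts.helpers.labels); ids not listed map to the ignore index.
ID_TO_TRAIN_ID = {7: 0, 8: 1, 11: 2, 12: 3, 13: 4, 17: 5, 19: 6, 20: 7,
                  21: 8, 22: 9, 23: 10, 24: 11, 25: 12, 26: 13, 27: 14,
                  28: 15, 31: 16, 32: 17, 33: 18}
IGNORE_INDEX = 255

def encode_train_ids(label: np.ndarray) -> np.ndarray:
    """Map a raw Cityscapes label map to the 19 training categories."""
    lut = np.full(256, IGNORE_INDEX, dtype=np.uint8)
    for raw_id, train_id in ID_TO_TRAIN_ID.items():
        lut[raw_id] = train_id
    return lut[label]  # vectorised lookup, pixel by pixel

raw = np.array([[7, 8, 0], [26, 33, 5]], dtype=np.uint8)
print(encode_train_ids(raw))  # rows become [0, 1, 255] and [13, 18, 255]
```

A lookup table keeps the remapping a single indexing operation, which matters when it runs inside a data-loading pipeline.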
[1] Combining Appearance and Structure from Motion Features for Road Scene Understanding
[2] ICNet for real-time semantic segmentation on high-resolution images
[3] Efficient dense modules of asymmetric convolution for real-time semantic segmentation
[4] Pyramid scene parsing network
[5] PSANet: Point-wise spatial attention network for scene parsing
python==3.6.2
pytorch==1.1
numpy==1.15
torchvision==0.3.0
pillow==7.1.2
cython==0.29.20
scipy==1.1.0
scikit-learn==0.16.2
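The pins above can be collected into a `requirements.txt`; note that the PyPI package name for PyTorch is `torch`, and the exact build to install depends on your CUDA setup, so treat this file as a sketch of the versions listed above rather than a verified install recipe.

```
torch==1.1.0
numpy==1.15.0
torchvision==0.3.0
pillow==7.1.2
cython==0.29.20
scipy==1.1.0
scikit-learn==0.16.2
```

With the file in place, `pip install -r requirements.txt` inside a Python 3.6 environment installs the dependencies.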
(1). Before running the scripts, please run `python src/setup.py build_ext --build-lib=./src/` to build the Cython extensions.
(2). For training the segmentation model, please run the command `python src/train.py --evaluate [False/True]`.
(3). We have saved the well-trained model on the Cityscapes dataset in Google Drive for reproducing our results. Specifically, first put the downloaded checkpoint into the folder `./ckpt`. Second, run `python src/test.py`; the predictions for the test images are saved into the folder `./ckpt/test_result`. Third, run `zip -r test_result.zip test_result` in bash. Finally, submit the file `test_result.zip` to the official online evaluator (https://www.cityscapes-dataset.com/submit/) to get the final performance on the Cityscapes test set.
The results should be around the following:
Class level
Metric | Average | Road | Sidewalk | Building | Wall | Fence | Pole | Traffic light | Traffic sign | Vegetation | Terrain | Sky | Person | Rider | Car | Truck | Bus | Train | Motorcycle | Bicycle |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
IoU (%) | 70.9003 | 98.0907 | 82.6953 | 90.8128 | 46.2016 | 48.8514 | 56.3521 | 61.3421 | 68.3546 | 92.0528 | 69.1465 | 94.5957 | 79.5957 | 61.4092 | 93.6822 | 53.2762 | 69.9236 | 60.5087 | 53.1659 | 67.0485 |
iIoU (%) | 43.4520 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | 56.2162 | 35.7356 | 85.4904 | 22.7908 | 38.0693 | 29.0138 | 29.1411 | 51.1591 |
Category level
Metric | Average | Flat | Nature | Object | Sky | Construction | Human | Vehicle |
---|---|---|---|---|---|---|---|---|
IoU (%) | 87.3518 | 98.3475 | 91.8220 | 63.0371 | 94.5957 | 91.0540 | 79.8839 | 92.7222 |
iIoU (%) | 70.3811 | n/a | n/a | n/a | n/a | n/a | 57.425 | 83.371 |
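The IoU numbers in the tables above are produced by the official Cityscapes evaluator, which accumulates a confusion matrix over all test pixels and computes IoU_c = TP_c / (TP_c + FP_c + FN_c) per class. The snippet below is a minimal sketch of that definition on a toy 3-class confusion matrix, not the official evaluation script:

```python
import numpy as np

def per_class_iou(conf: np.ndarray) -> np.ndarray:
    """Per-class IoU from a confusion matrix whose rows are
    ground truth and whose columns are predictions."""
    tp = np.diag(conf).astype(float)     # correctly classified pixels
    fp = conf.sum(axis=0) - tp           # predicted as c but not c
    fn = conf.sum(axis=1) - tp           # truly c but missed
    return tp / (tp + fp + fn)

# Toy 3-class confusion matrix (not the 19 Cityscapes classes).
conf = np.array([[10, 1, 0],
                 [2, 8, 0],
                 [0, 0, 5]])
iou = per_class_iou(conf)
print(iou, iou.mean())  # class IoUs 10/13, 8/11, 5/5, then their mean
```

The "Average" column in the tables is the unweighted mean of the per-class IoUs; the instance-level iIoU additionally reweights TP and FN by instance size, which is why it is reported only for the classes that have instance annotations.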
If our work is helpful for your research, please consider citing our paper:
@article{hao2022real,
title={Real-Time Semantic Segmentation via Spatial-Detail Guided Context Propagation},
author={Hao, Shijie and Zhou, Yuan and Guo, Yanrong and Hong, Richang and Cheng, Jun and Wang, Meng},
journal={IEEE Transactions on Neural Networks and Learning Systems},
year={2022},
publisher={IEEE}
}
Many thanks to the authors of the great works DeepLab, Condense, HyperSeg, Mobilenet, OpCounter, and TensorRT.