This repository provides an implementation of our ACCV 2022 paper "RGB Road Scene Material Segmentation". If you use our code and data, please cite our paper.
Please note that this is research software and may contain bugs or other issues – please use it at your own risk. If you experience major problems with it, you may contact us, but please note that we do not have the resources to deal with all issues.
```
@InProceedings{Cai_2022_ACCV,
  author    = {Sudong Cai and Ryosuke Wakaki and Shohei Nobuhara and Ko Nishino},
  title     = {RGB Road Scene Material Segmentation},
  booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
  month     = {Dec},
  year      = {2022}
}
```
The implementation of the Mix-Transformer (MiT) encoder and the corresponding ImageNet pre-trained model are adopted from the original SegFormer project.
The dataloader, utilities, and training/evaluation scripts are adapted from the related GitHub project pytorch-deeplab-xception.
The KITTI-Materials dataset is provided under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
Our KITTI-Materials dataset is available as `data.zip` at: Google Drive. Uncompress the zip to extract the files into `data/KITTI_Materials/`.
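If you prefer to script the extraction, here is a minimal sketch (it assumes `data.zip` contains the `KITTI_Materials` folder at its top level; adjust the paths otherwise):

```python
# Extract data.zip so that data/KITTI_Materials/ is created.
# Assumption: the archive holds KITTI_Materials/ at its top level.
import zipfile
from pathlib import Path

Path("data").mkdir(exist_ok=True)
with zipfile.ZipFile("data.zip") as zf:
    zf.extractall("data")

print(sorted(p.name for p in Path("data/KITTI_Materials").iterdir()))
```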
Please note that `list_folder1` and `list_folder2` in the dataset folder denote split-1 and split-2, respectively.
The pretrained RMSNet weights for KITTI-Materials can be found as `rmsnet_split1.pth` and `rmsnet_split2.pth` at: Google Drive. The ImageNet pre-trained weight for the MiT-B2 encoder, `mit_b2.pth`, can be found at: mit_b2.
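After downloading, you can quickly check that a checkpoint deserializes. A minimal sketch; whether the weights sit under a `state_dict` key or form the state dict itself is an assumption, so the code handles both:

```python
# Sanity-check a downloaded checkpoint on CPU (no GPU required).
import torch

ckpt = torch.load("weights/rmsnet/rmsnet_split1.pth", map_location="cpu")
# Layout assumption: either {'state_dict': {...}} or the state dict itself.
state = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
print(f"{len(state)} tensors, first key: {next(iter(state))}")
```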
We tested our code with Python 3.8 on Ubuntu 18.04 LTS using the packages listed below. Other recent versions will likely also work, but we have not verified them; a quick version check is sketched after the list.
You can also use `env.def` to create a Singularity container with these packages.
- pytorch==1.11.0
- torchvision==0.12.0
- opencv_contrib_python==4.5.2.54
- tqdm==4.62.3
- einops==0.5.0
- timm==0.6.11
- matplotlib==3.6.2
- tensorboardx==2.5.1
- pillow==9.0.1
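As a quick check, this sketch compares installed versions against the pins above (note that the `pytorch` pin corresponds to the PyPI package `torch`):

```python
# Print installed vs. tested package versions.
from importlib.metadata import version, PackageNotFoundError

tested = {
    "torch": "1.11.0",
    "torchvision": "0.12.0",
    "opencv-contrib-python": "4.5.2.54",
    "tqdm": "4.62.3",
    "einops": "0.5.0",
    "timm": "0.6.11",
    "matplotlib": "3.6.2",
    "tensorboardX": "2.5.1",
    "Pillow": "9.0.1",
}
for name, pin in tested.items():
    try:
        print(f"{name}: installed {version(name)}, tested {pin}")
    except PackageNotFoundError:
        print(f"{name}: NOT INSTALLED (tested {pin})")
```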
Put the dataset and pretrained weights as follows.
- Uncompress the 'KITTI_Materials' dataset into `data/` (i.e., the data path is expected to be `data/KITTI_Materials/`).
- Put the pretrained weight `mit_b2.pth` into `weights/init/`.
- Put the pretrained weights for our RMSNet into `weights/rmsnet/` (only needed for evaluation; unnecessary if training from scratch).
As a result, you should have the following directory structure (a small sanity-check sketch follows the tree).
```
.
├── LICENSE
├── README.md
├── data
│   └── KITTI_Materials
│       ├── kitti_advanced_classes_weights.npy
│       ├── list_folder1/
│       ├── list_folder2/
│       └── train/
├── test.py
├── train.py
└── weights
    ├── init
    │   └── mit_b2.pth
    ├── rmsnet
    │   ├── rmsnet_split1.pth
    │   └── rmsnet_split2.pth
    └── save_path/
```
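You can sanity-check this layout with a few lines of Python (paths copied from the tree above):

```python
# Verify the expected dataset and weight files are in place.
from pathlib import Path

expected = [
    "data/KITTI_Materials/kitti_advanced_classes_weights.npy",
    "data/KITTI_Materials/list_folder1",
    "data/KITTI_Materials/list_folder2",
    "data/KITTI_Materials/train",
    "weights/init/mit_b2.pth",
    "weights/rmsnet/rmsnet_split1.pth",
    "weights/rmsnet/rmsnet_split2.pth",
]
for p in expected:
    print(("ok      " if Path(p).exists() else "MISSING ") + p)
```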
Run `test.py` with the trained weights. It should output the following (a sketch of how these metrics are computed appears after the log).
```
Validation:
[Epoch: 0, numImages: 200]
Acc:0.8500841899671052, Acc_class:0.6338839179588864, mIoU:0.46823551505456135, fwIoU: 0.7583153107992938
Loss: 120.971
```
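For reference, `Acc`, `Acc_class`, `mIoU`, and `fwIoU` are the standard confusion-matrix metrics computed by the evaluator adapted from pytorch-deeplab-xception; a minimal sketch of the formulas:

```python
# Segmentation metrics from a confusion matrix, following the
# pytorch-deeplab-xception evaluator that this repo adapts.
# cm[i, j] = number of pixels of ground-truth class i predicted as class j.
import numpy as np

def segmentation_scores(cm):
    tp = np.diag(cm).astype(float)
    acc = tp.sum() / cm.sum()                       # overall pixel accuracy
    acc_class = np.nanmean(tp / cm.sum(axis=1))     # mean per-class accuracy
    iou = tp / (cm.sum(axis=1) + cm.sum(axis=0) - tp)
    miou = np.nanmean(iou)                          # mean IoU over classes
    freq = cm.sum(axis=1) / cm.sum()                # per-class pixel frequency
    fwiou = (freq[freq > 0] * iou[freq > 0]).sum()  # frequency-weighted IoU
    return acc, acc_class, miou, fwiou
```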
Run `train.py` to train RMSNet from scratch. Training requires a GPU with 40GB or more memory. The output should look as follows (the mIoU may fluctuate by around 1%).
```
=>Epoches 298, learning rate = 0.0000, previous best = 0.4667
Train loss: 0.149: 100%|███████████████████████████████████████████████████████████████████████████| 66/66 [01:41<00:00, 1.54s/it]
[Epoch: 298, numImages: 792]
Loss: 9.852
save path: weights/save_path/
use the ImageNet pre-trained model
Test loss: 1.065: 100%|████████████████████████████████████████████████████████████████████████████| 10/10 [00:14<00:00, 1.43s/it]
Validation:
[Epoch: 298, numImages: 200]
Acc:0.84499755859375, Acc_class:0.6205695550166175, mIoU:0.45809623406927835, fwIoU: 0.7553211684835024
Loss: 10.653
Number of images in train: 800
Number of images in val: 200
0%| | 0/66 [00:00<?, ?it/s]
=>Epoches 299, learning rate = 0.0000, previous best = 0.4667
Train loss: 0.145: 100%|███████████████████████████████████████████████████████████████████████████| 66/66 [01:41<00:00, 1.54s/it]
[Epoch: 299, numImages: 792]
Loss: 9.592
save path: weights/save_path/
use the ImageNet pre-trained model
Test loss: 1.083: 100%|████████████████████████████████████████████████████████████████████████████| 10/10 [00:18<00:00, 1.82s/it]
Validation:
[Epoch: 299, numImages: 200]
Acc:0.8449763826069079, Acc_class:0.6195685580715145, mIoU:0.45762377309155483, fwIoU: 0.7553741009645986
Loss: 10.831
```
Note that if you want to train with customized settings, please change the corresponding hyperparameters (e.g., learning rate, number of epochs, Sync BN) directly in `train.py` instead of passing them via argparse.
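For instance, an edit of this kind looks like the following; the variable names and values here are purely illustrative placeholders, not the actual definitions in `train.py`:

```python
# Illustrative placeholders only -- locate and edit the actual
# counterparts defined in train.py.
lr = 1e-4         # learning rate
epochs = 300      # consistent with the "Epoches 298/299" log above
sync_bn = False   # synchronized BatchNorm for multi-GPU training
```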