This repo contains the code for our paper:
Efficient Spatial-Temporal Information Fusion for LiDAR-Based 3D Moving Object Segmentation.
Jiadai Sun, Yuchao Dai, Xianjing Zhang, Jintao Xu, Rui Ai, Weihao Gu, and Xieyuanli Chen
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2022
# Ubuntu 18.04 and above is recommended.
conda env create -f environment.yaml
conda activate mos3d
# Install SoftPool following https://github.com/alexandrosstergiou/SoftPool
git clone https://github.com/alexandrosstergiou/SoftPool.git
cd SoftPool/pytorch
make install
# (optional)
make test
# Install TorchSparse following https://github.com/mit-han-lab/torchsparse
sudo apt install libsparsehash-dev
pip install --upgrade git+https://github.com/mit-han-lab/torchsparse.git@v1.4.0
Download the toy dataset and pretrained weights, and unzip them to project_path. You can also use gdown to download them from the command line.
Download commands for the toy dataset and checkpoints:
gdown --id 1t8OuDgFzUspWtYVHSfiGkXtGrBsuvtWL # for toy-data
unzip toydata.zip
mkdir log && cd log
gdown --id 199hRJBs-3MVgqrd4Tb08Eo5pjBG74cSX # for checkpoints
unzip ckpt_motionseg3d_pointrefine.zip
Then use the following commands to run inference and visualize the predictions. If you use the toy dataset, modify the seq_id corresponding to valid in model_path/data_cfg.yaml.
# Run inference to generate the predictions.
python infer.py -d ./toydata -m ./log/motionseg3d_pointrefine -l ./pred/oursv1 -s valid
python infer.py -d ./toydata -m ./log/motionseg3d_pointrefine -l ./pred/oursv2 -s valid --pointrefine
# Visualize the predictions.
python utils/visualize_mos.py -d ./toydata -p ./pred/oursv2 --offset 0 -s 38
- Download KITTI Odometry Benchmark Velodyne point clouds (80 GB) from here.
- Download KITTI Odometry Benchmark calibration data (1 MB) from here.
- Download SemanticKITTI label data (179 MB) (alternatively the data in Files corresponds to the same data) from here.
- Download KITTI-Road Velodyne point clouds from original website, more details can be found in config/kitti_road_mos.md
- Download the KITTI-Road-MOS label data annotated by us, the pose and calib files from here (6.1 MB) .
- Extract everything into the same folder, as follows:
Expected directory structure of SemanticKITTI:
DATAROOT
├── sequences
│   └── 08
│       ├── calib.txt           # calibration file provided by KITTI
│       ├── poses.txt           # ground truth poses file provided by KITTI
│       ├── velodyne            # velodyne 64 LiDAR scans provided by KITTI
│       │   ├── 000000.bin
│       │   ├── 000001.bin
│       │   └── ...
│       ├── labels              # ground truth labels provided by SemanticKITTI
│       │   ├── 000000.label
│       │   ├── 000001.label
│       │   └── ...
│       └── residual_images_1   # the proposed residual images
│           ├── 000000.npy
│           ├── 000001.npy
│           └── ...
- Next, run the data preparation script (based on LMNet) to generate the residual images. Further data preparation parameters are set in config/data_preparing.yaml.
python utils/auto_gen_residual_images.py
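Before moving on to inference or training, it can help to confirm that a sequence folder matches the expected layout. The following is a minimal sketch, not part of this repo: the function name `check_sequence` is hypothetical, and it assumes that every scan in `velodyne` should have a residual image with the same frame number in `residual_images_1`. Pass `require_labels=False` for sequences that ship without ground truth labels.

```python
from pathlib import Path

# Entries each sequence folder is expected to contain, per the tree above.
REQUIRED_FILES = ("calib.txt", "poses.txt")
REQUIRED_DIRS = ("velodyne", "residual_images_1")


def check_sequence(seq_dir, require_labels=True):
    """Return a list of human-readable problems found in one sequence folder."""
    seq_dir = Path(seq_dir)
    problems = []

    for name in REQUIRED_FILES:
        if not (seq_dir / name).is_file():
            problems.append(f"missing file: {name}")

    dirs = REQUIRED_DIRS + (("labels",) if require_labels else ())
    for name in dirs:
        if not (seq_dir / name).is_dir():
            problems.append(f"missing directory: {name}")

    scan_dir = seq_dir / "velodyne"
    res_dir = seq_dir / "residual_images_1"
    if scan_dir.is_dir() and res_dir.is_dir():
        # Assumption: one residual image per scan, matched by frame number.
        scans = sorted(p.stem for p in scan_dir.glob("*.bin"))
        residuals = {p.stem for p in res_dir.glob("*.npy")}
        missing = [stem for stem in scans if stem not in residuals]
        if missing:
            problems.append(
                f"{len(missing)} scan(s) without a residual image, "
                f"first: {missing[0]}"
            )
    return problems
```

Calling `check_sequence("DATAROOT/sequences/08")` returns an empty list when nothing is missing.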
The newly labeled KITTI-Road-MOS data is divided into train and valid sets.
Which data is used can be controlled by specifying --data_config during training. During inference, if you use the toy dataset or have not downloaded KITTI-Road-MOS, modify the seq_id corresponding to valid in model_path/data_cfg.yaml.
# validation split
python infer.py -d DATAROOT -m ./log/model_path/logs/TIMESTAMP/ -l ./predictions/ -s valid
# test split
python infer.py -d DATAROOT -m ./log/model_path/logs/TIMESTAMP/ -l ./predictions/ -s test
The predictions/labels will be saved to ./predictions/.
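To inspect a saved label file by hand, note that SemanticKITTI stores one little-endian 32-bit unsigned integer per point, with the semantic class in the lower 16 bits and the instance id in the upper 16 bits. The sketch below decodes such a file using only the standard library. It assumes the prediction files follow the same layout as the ground truth labels, and the default `moving_id=251` is an assumption about how moving points are written back. Check the inverse label map in your data config before relying on it. The function names are hypothetical.

```python
import struct


def read_label_file(path):
    """Decode a .label file into per-point (semantic, instance) lists."""
    with open(path, "rb") as f:
        raw = f.read()
    if len(raw) % 4 != 0:
        raise ValueError(f"{path}: size is not a multiple of 4 bytes")
    values = [v for (v,) in struct.iter_unpack("<I", raw)]
    semantic = [v & 0xFFFF for v in values]  # lower 16 bits: class id
    instance = [v >> 16 for v in values]     # upper 16 bits: instance id
    return semantic, instance


def count_moving(semantic, moving_id=251):
    """Count points labeled as moving (moving_id is an assumed class id)."""
    return sum(1 for label in semantic if label == moving_id)
```

For a given frame, `count_moving(read_label_file(path)[0])` gives a quick check on whether a prediction contains any moving points at all.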
# Only on seq08
python utils/evaluate_mos.py -d DATAROOT -p ./predictions/ --datacfg config/labels/semantic-kitti-mos.raw.yaml
# On seq08 + road-validation-split
python utils/evaluate_mos.py -d DATAROOT -p ./predictions/ --datacfg config/labels/semantic-kitti-mos.yaml
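Moving object segmentation is commonly scored by the intersection-over-union of the moving class, IoU = TP / (TP + FP + FN). The sketch below illustrates that formula on plain Python lists. It is not the code in utils/evaluate_mos.py. The class ids (0 = unlabeled and ignored, 1 = static, 2 = moving) are assumptions made for the example, and `moving_iou` is a hypothetical name.

```python
def moving_iou(pred, gt, moving_id=2, ignore_id=0):
    """IoU of the moving class; points whose ground truth is ignore_id are skipped."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of points")
    tp = fp = fn = 0
    for p, g in zip(pred, gt):
        if g == ignore_id:
            continue  # unlabeled ground truth does not count either way
        if p == moving_id and g == moving_id:
            tp += 1
        elif p == moving_id:
            fp += 1
        elif g == moving_id:
            fn += 1
    union = tp + fp + fn
    # With no moving points in either pred or gt, the IoU is undefined.
    return tp / union if union else float("nan")
```

In practice, the counts would be accumulated over all frames of the evaluated sequences before dividing, rather than averaging per-frame IoU values.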
Training is separated into two phases, and switching between them is currently done manually.
--data_config determines whether the new KITTI-Road-MOS label data is used: -dc config/labels/semantic-kitti-mos.yaml (with KITTI-Road-MOS) or -dc config/labels/semantic-kitti-mos.raw.yaml (without).
- Phase 1 (multi-gpu): Only the range image is used for input and supervision. The training log and checkpoint will be stored in ./log/ours_motionseg3d/logs/TIMESTAMP/.
export CUDA_VISIBLE_DEVICES=0,1,2,3
python train.py -d DATAROOT -ac ./train_yaml/mos_coarse_stage.yml -l log/ours_motionseg3d
- Phase 2 (single gpu): After the first phase of training, use the following command to start the second phase of training for the PointRefine module.
export CUDA_VISIBLE_DEVICES=0
python train_2stage.py -d DATAROOT -ac ./train_yaml/mos_pointrefine_stage.yml -l log/ours_motionseg3d_pointrefine -p "./log/ours_motionseg3d/logs/TIMESTAMP/"
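Because the phase switch is manual, the TIMESTAMP folder from Phase 1 has to be looked up and passed to `-p` by hand. The sketch below shows one way to automate that lookup. Both helper names are hypothetical and not part of this repo. Since the TIMESTAMP naming format is not specified here, the most recent run is chosen by modification time, not by name. The command-line flags are copied from the Phase 2 command above.

```python
from pathlib import Path


def latest_run_dir(log_root):
    """Return the most recently modified run folder under <log_root>/logs/."""
    logs_dir = Path(log_root) / "logs"
    runs = [p for p in logs_dir.iterdir() if p.is_dir()]
    if not runs:
        raise FileNotFoundError(f"no run directories under {logs_dir}")
    return max(runs, key=lambda p: p.stat().st_mtime)


def phase2_command(dataroot, phase1_log="./log/ours_motionseg3d"):
    """Build the Phase 2 argument list, pointing -p at the latest Phase 1 run."""
    run_dir = latest_run_dir(phase1_log)
    return [
        "python", "train_2stage.py",
        "-d", str(dataroot),
        "-ac", "./train_yaml/mos_pointrefine_stage.yml",
        "-l", "log/ours_motionseg3d_pointrefine",
        "-p", str(run_dir),
    ]
```

The returned list can be printed with `shlex.join` for inspection or passed to `subprocess.run` after setting `CUDA_VISIBLE_DEVICES`. If several Phase 1 runs exist, verify that the selected folder is the one you intend to refine.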
If you find this code useful for your research, please use the following BibTeX entry.
@inproceedings{sun2022mos3d,
title={Efficient Spatial-Temporal Information Fusion for LiDAR-Based 3D Moving Object Segmentation},
author={Sun, Jiadai and Dai, Yuchao and Zhang, Xianjing and Xu, Jintao and Ai, Rui and Gu, Weihao and Chen, Xieyuanli},
booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year={2022},
organization={IEEE}
}
We would like to thank Yufei Wang and Mochu Xiang for their insightful and effective discussions.
Some of the code in this repo is borrowed from LMNet and spvnas.
Copyright 2022, Jiadai Sun, Xieyuanli Chen, Xianjing Zhang, HAOMO.AI Technology Co., Ltd., China.
This project is free software made available under the GPL v3.0 License. For details see the LICENSE file.