PVO: Panoptic Visual Odometry
[Weicai Ye, Xinyue Lan] (Co-Authors), Shuo Chen, Yuhang Ming, Xingyuan Yu, Hujun Bao, Zhaopeng Cui, Guofeng Zhang
CVPR 2023
- Prepare: follow prepare.md.
- Generate the initial panoptic segmentation.
sh tools/initial_segmentation.sh
- vps->vo: the VO Module generates pose, flow, and depth.
sh tools/test_vo_scene.sh
- vo->vps: the VPS Module uses the flow and depth from the VO Module to generate the final video panoptic segmentation results and the VPQ scores.
sh tools/test_vps.sh
Scene | RMSE | vpq_all/vpq_thing/vpq_stuff |
---|---|---|
Scene01 | 0.371 | 40.39/26.43/44.57 |
Scene02 | 0.058 | 68.84/88.83/62.18 |
Scene06 | 0.113 | 66.38/79.99/62.97 |
Scene18 | 0.951 | 68.35/83.86/63.92 |
Scene20 | 3.503 | 35.11/16.83/40.59 |
The results reported in the paper can be obtained by iterating the vps->vo and vo->vps steps multiple times.
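The steps above, repeated for several rounds, can be sketched as a small driver. This is a minimal sketch, not part of the repository: the script paths are the ones listed above, while `build_pipeline`, `run_pipeline`, `num_iters`, and `dry_run` are hypothetical names, and it assumes that each script picks up the outputs of the previous step without extra arguments.

```python
import subprocess

# Script paths as listed in the steps above.
INIT_CMD = ["sh", "tools/initial_segmentation.sh"]
VO_CMD = ["sh", "tools/test_vo_scene.sh"]
VPS_CMD = ["sh", "tools/test_vps.sh"]


def build_pipeline(num_iters=1):
    """Return the command list: one initial segmentation, then
    `num_iters` rounds of vps->vo followed by vo->vps."""
    if num_iters < 1:
        raise ValueError("num_iters must be at least 1")
    commands = [INIT_CMD]
    for _ in range(num_iters):
        commands.append(VO_CMD)   # vps->vo: pose, flow, depth
        commands.append(VPS_CMD)  # vo->vps: final segmentation and VPQ
    return commands


def run_pipeline(num_iters=1, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_pipeline(num_iters):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Assumption: two rounds; the number of rounds used in the paper is not stated here.
    run_pipeline(num_iters=2, dry_run=True)
```

With `dry_run=True` the driver only prints the commands, which is a cheap way to check the order before launching a long run.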
- To train VPS_Module, refer to Detectron2 for more training details. For example, the command below trains on vkitti 15-deg-left with 4 GPUs and saves the training results to output/vps_training/. You can modify the config file to suit your hardware.
python -W ignore VPS_Module/tools/train_net.py \
--config-file VPS_Module/configs/COCO-PanopticSegmentation/panoptic_fpn_R_50_3x_vkitti_511.yaml --num-gpu 4 \
MODEL.WEIGHTS checkpoints/panFPN.pth \
OUTPUT_DIR output/vps_training/
Alternatively, you can use the tools/train_net.sh script instead of that command.
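When adapting the training setup to a different number of GPUs, a common Detectron2 convention is the linear scaling rule: keep the per-GPU batch size fixed and scale the learning rate with the total batch size. The sketch below computes the corresponding `SOLVER.IMS_PER_BATCH` and `SOLVER.BASE_LR` overrides, which can be appended to the command line in the same way as `MODEL.WEIGHTS` and `OUTPUT_DIR`. The reference values (16 images per batch at a learning rate of 0.02 on 8 GPUs) are hypothetical and are not read from the repository's config file; `scale_solver` and `as_overrides` are illustrative helper names.

```python
def scale_solver(ref_lr, ref_batch, ref_gpus, num_gpus):
    """Linear scaling rule: fixed per-GPU batch size, learning rate
    proportional to the total batch size."""
    if ref_batch % ref_gpus != 0:
        raise ValueError("ref_batch must be divisible by ref_gpus")
    per_gpu = ref_batch // ref_gpus
    batch = per_gpu * num_gpus
    lr = ref_lr * batch / ref_batch
    return lr, batch


def as_overrides(lr, batch):
    """Format the values as KEY VALUE pairs for the train_net.py command line."""
    return ["SOLVER.IMS_PER_BATCH", str(batch), "SOLVER.BASE_LR", f"{lr:g}"]


if __name__ == "__main__":
    # Hypothetical reference schedule; check the actual config before use.
    lr, batch = scale_solver(ref_lr=0.02, ref_batch=16, ref_gpus=8, num_gpus=4)
    print(" ".join(as_overrides(lr, batch)))
```

The rule is a heuristic, so it is worth checking the training curves after changing the batch size.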
- To train VO_Module, refer to DROID-SLAM for more training details. For example, the following command trains on vkitti with 4 GPUs.
python VO_Module/train.py --gpus=4 --lr=0.00025
For visualization, refer to DROID-SLAM. All demos can be run on a GPU with 11 GB of memory. While running, press the "s" key to increase the filtering threshold (= more points) and the "a" key to decrease it (= fewer points).
python VO_Module/evaluation_scripts/test_vo.py --datapath=datasets/Virtual_KITTI2/Scene01 --segm_filter True
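The effect of the "s" and "a" keys can be illustrated with a toy model of threshold-based point filtering. This is a simplified sketch and not the visualizer's actual code: it assumes that each point carries a scalar error and is shown only when that error is below the threshold, and that each key press doubles or halves the threshold. The names `ThresholdControl`, `on_key`, and `visible_points` are hypothetical.

```python
class ThresholdControl:
    """Toy model of a key-controlled filtering threshold."""

    def __init__(self, thresh=0.01):
        self.thresh = thresh

    def on_key(self, key):
        # Assumption: "s" doubles the threshold, "a" halves it.
        if key == "s":
            self.thresh *= 2.0
        elif key == "a":
            self.thresh *= 0.5

    def visible_points(self, errors):
        """Return the indices of points whose error is below the threshold."""
        return [i for i, err in enumerate(errors) if err < self.thresh]


if __name__ == "__main__":
    errors = [0.002, 0.008, 0.015, 0.03]  # made-up per-point errors
    control = ThresholdControl(thresh=0.01)
    print(len(control.visible_points(errors)))
    control.on_key("s")  # higher threshold, more points pass the filter
    print(len(control.visible_points(errors)))
    control.on_key("a")
    control.on_key("a")  # lower threshold, fewer points pass the filter
    print(len(control.visible_points(errors)))
```

A higher threshold admits noisier points, so the point cloud gets denser but less clean.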
If you find this code useful for your research, please use the following BibTeX entry.
@inproceedings{Ye2023PVO,
title={{PVO: Panoptic visual odometry}},
author={Ye, Weicai and Lan, Xinyue and Chen, Shuo and Ming, Yuhang and Yu, Xingyuan and Bao, Hujun and Cui, Zhaopeng and Zhang, Guofeng},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={9579--9589},
year={2023}
}
Some code snippets are borrowed from DROID-SLAM and Detectron2. Thanks for these great projects.