This is the official implementation of the paper "DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors", which jointly estimates scene depth and detects 3D objects in the 3D world. Given a binocular image pair as input, our model achieves 70+ AP on the KITTI val set.
DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors
Authors: Yilun Chen, Shijia Huang, Shu Liu, Bei Yu, Jiaya Jia
- 7/2022: We released the first vision-based model that achieved 70+ AP on the KITTI val set.
(1) Download the KITTI 3D object detection dataset including velodyne, stereo images, calibration matrices, and the road plane. The folders are organized as follows:
ROOT_PATH
├── data
│ ├── kitti
│ │ │── ImageSets
│ │ │── training
│ │ │ ├──calib & velodyne & label_2 & image_2 & image_3 & (optional: planes)
│ │ │── testing
│ │ │ ├──calib & velodyne & image_2 & image_3
├── pcdet
├── mmdetection-v2.22.0
(2) Generate the KITTI data lists and the joint Stereo-LiDAR Copy-Paste database for training.
python -m pcdet.datasets.kitti.lidar_kitti_dataset create_kitti_infos
python -m pcdet.datasets.kitti.lidar_kitti_dataset create_gt_database_only --image_crops
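Before running the two commands above, it can help to confirm that the folder layout from step (1) is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_kitti_dirs` and the `EXPECTED` table are hypothetical, and the check only covers the directories named in the layout (the optional `planes` folder is skipped).

```python
from pathlib import Path

# Sub-folders named in the layout above. "planes" is optional and not checked.
# This table is an illustration, not something the repository defines.
EXPECTED = {
    "training": ["calib", "velodyne", "label_2", "image_2", "image_3"],
    "testing": ["calib", "velodyne", "image_2", "image_3"],
}


def missing_kitti_dirs(root):
    """Return the expected directories (relative to root) that do not exist."""
    relative = [Path("data/kitti/ImageSets")]
    for split, names in EXPECTED.items():
        relative.extend(Path("data/kitti") / split / name for name in names)
    return [p.as_posix() for p in relative if not (Path(root) / p).is_dir()]
```

Calling `missing_kitti_dirs(".")` from ROOT_PATH returns an empty list when every listed folder is present, and otherwise the relative paths that still need to be created or linked.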
(1) Clone this repository.
git clone https://github.com/chenyilun95/DSGN2
cd DSGN2
(2) Install mmcv-1.4.0 library.
pip install pycocotools==2.0.2
pip install torch==1.7.1 torchvision==0.8.2
pip install mmcv-full==1.4.0 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.7.1/index.html
(3) Install mmdetection-v2.22.0 inside this repository.
cd mmdetection-v2.22.0
pip install -e .
(4) Install the pcdet library from the repository root.
cd ..
pip install -e .
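After installation, the pinned versions from the pip commands above can be compared against what is actually installed. This is a hedged sketch rather than a tool shipped with the repository: `PINS`, `installed_version`, and `pin_mismatches` are hypothetical names, and it assumes Python 3.8 or newer for `importlib.metadata`.

```python
from importlib import metadata

# Versions pinned by the pip commands above.
PINS = {
    "pycocotools": "2.0.2",
    "torch": "1.7.1",
    "torchvision": "0.8.2",
    "mmcv-full": "1.4.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def pin_mismatches(pins=PINS, lookup=installed_version):
    """Map each package that differs from its pin to a (pinned, found) pair."""
    result = {}
    for name, pinned in pins.items():
        found = lookup(name)
        # A local build tag such as "1.7.1+cu101" is treated as matching "1.7.1".
        if found is None or found.split("+")[0] != pinned:
            result[name] = (pinned, found)
    return result
```

An empty dictionary from `pin_mismatches()` means all four packages match their pins; otherwise each entry shows the pinned version next to the one found (or `None` if the package is missing).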
Train the model with:
python -m torch.distributed.launch --nproc_per_node=4 tools/train.py \
--launcher pytorch \
--fix_random_seed \
--workers 2 \
--sync_bn \
--save_to_file \
--cfg_file ./configs/stereo/kitti_models/dsgn2.yaml \
--tcp_port 12345
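When the number of GPUs or the port differs between machines, the command above can be assembled programmatically. The sketch below only rearranges the flags already shown; the function name `build_train_command` and its parameters are hypothetical, and it assumes Python 3.8 or newer for `shlex.join`.

```python
import shlex


def build_train_command(num_gpus=4,
                        cfg_file="./configs/stereo/kitti_models/dsgn2.yaml",
                        workers=2,
                        tcp_port=12345):
    """Assemble the training launch command shown above as an argument list."""
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        "tools/train.py",
        "--launcher", "pytorch",
        "--fix_random_seed",
        "--workers", str(workers),
        "--sync_bn",
        "--save_to_file",
        "--cfg_file", cfg_file,
        "--tcp_port", str(tcp_port),
    ]


# Print the command for a two-GPU machine so it can be inspected before running.
print(shlex.join(build_train_command(num_gpus=2)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Keeping the arguments as a list avoids shell quoting issues if the config path contains spaces.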
Evaluate the model with:
python -m torch.distributed.launch --nproc_per_node=4 tools/test.py \
--launcher pytorch \
--workers 2 \
--save_to_file \
--cfg_file ./configs/stereo/kitti_models/dsgn2.yaml \
--exp_name default \
--tcp_port 12345 \
--ckpt_id 60
The evaluation results can be found in the model folder.
We provide the pretrained DSGN++ model, evaluated on the KITTI val set.
Methods | Car | Ped. | Cyc. | Models |
---|---|---|---|---|
DSGN++ | 70.05 | 39.42 | 44.47 | GoogleDrive |
- Still in progress
If you find our work useful in your research, please consider citing:
@ARTICLE{chen2022dsgn++,
title={DSGN++: Exploiting Visual-Spatial Relation for Stereo-Based 3D Detectors},
author={Chen, Yilun and Huang, Shijia and Liu, Shu and Yu, Bei and Jia, Jiaya},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2022}
}
Our code builds on several released code repositories. We thank the authors of LIGA-Stereo, OpenPCDet, and mmdetection for their great code.
If you run into problems or have suggestions for this repository, please feel free to contact me (chenyilun95@gmail.com).