MonoDTR

MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer (CVPR 2022) [paper]
Kuan-Chih Huang, Tsung-Han Wu, Hung-Ting Su, Winston H. Hsu.

Update

The code for the KITTI-360 dataset is now available in the kitti360 branch, and the results can be viewed on the KITTI-360 leaderboard.

Setup

Please refer to INSTALL.md for installation and to DATA.md for data preparation.

Train

Move to the repository root ($MonoDTR_ROOT) and train the network under the experiment name $EXP_NAME:

 cd $MonoDTR_ROOT
 ./launcher/train.sh config/config.py 0 $EXP_NAME

Note: this repo only supports single-GPU training. Also, because of training randomness in monocular 3D object detection, results can vary by about ±1 AP3D between runs.

Eval

To evaluate on the validation set using checkpoint $CHECKPOINT_PATH:

 ./launcher/eval.sh config/config.py 0 $CHECKPOINT_PATH validation

We provide a trained checkpoint for the car category on the train/val split here.
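
The train and eval commands above can be combined into a small dry-run script that prints both invocations before anything is launched. This is only a sketch: the values given to EXP_NAME and CHECKPOINT_PATH are hypothetical placeholders, and treating the `0` argument as a GPU index is an assumption that this README does not state.

```shell
#!/usr/bin/env bash
# Dry-run sketch: assemble the train and eval commands and print them.
set -eu

EXP_NAME="my_experiment"                   # hypothetical experiment name
CHECKPOINT_PATH="path/to/checkpoint.pth"   # hypothetical placeholder, not a real path
GPU=0                                      # assumption: the numeric argument is the GPU index

# Same argument order as the commands in the Train and Eval sections.
TRAIN_CMD="./launcher/train.sh config/config.py ${GPU} ${EXP_NAME}"
EVAL_CMD="./launcher/eval.sh config/config.py ${GPU} ${CHECKPOINT_PATH} validation"

# Print only; nothing is executed here.
echo "$TRAIN_CMD"
echo "$EVAL_CMD"
```

Once the printed commands look right, run them directly from the repository root.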

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{huang2022monodtr,
    author = {Kuan-Chih Huang and Tsung-Han Wu and Hung-Ting Su and Winston H. Hsu},
    title = {MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer},
    booktitle = {CVPR},
    year = {2022}    
}

Acknowledgment

Our code is mainly based on visualDet3D and also benefits from CaDDN, MonoDLE, and LoFTR. Thanks for their contributions!

License

This project is released under the MIT License.
