FSTR

Fully Sparse Transformer 3D Detector for LiDAR Point Cloud

Paper | nuScenes Leaderboard

All statistics are measured on a single Tesla A100 GPU using the best model from each official repository. Some sparse modules in the models are supported.


FSTR is a fully sparse LiDAR-based detector that achieves a better accuracy-efficiency trade-off than other popular LiDAR-based detectors. It is a lightweight DETR-like framework with a single decoder layer designed for LiDAR-only detection, and it obtains 73.6% NDS (FSTR-XLarge with TTA) on the nuScenes benchmark and 31.5% CDS (FSTR-Large) on the Argoverse2 validation dataset.

Currently Supported Features

  • Support nuScenes dataset
  • Support Argoverse2 dataset

Preparation

  • Environments (one possible installation sequence is sketched after this list)
    Python == 3.8
    CUDA == 11.1
    PyTorch == 1.9.0
    mmcv-full == 1.6.0
    mmdet == 2.24.0
    mmsegmentation == 0.29.1
    mmdet3d == 1.0.0rc5
    flash-attn == 0.2.2
    spconv-plus == 2.1.21

  • Data
    Follow the mmdet3d documentation to process the nuScenes dataset (a typical command is sketched below).
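
A minimal installation sketch, assuming a CUDA 11.1 machine; the exact wheel index URLs depend on your environment, and spconv-plus may need to be built from its own repository rather than installed from PyPI:

# PyTorch 1.9.0 built against CUDA 11.1
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
# OpenMMLab stack pinned to the versions listed above
pip install mmcv-full==1.6.0 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
pip install mmdet==2.24.0 mmsegmentation==0.29.1 mmdet3d==1.0.0rc5
# attention and sparse-convolution backends (flash-attn compiles against your local CUDA toolkit)
pip install flash-attn==0.2.2
pip install spconv-plus==2.1.21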
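
For the nuScenes info files, mmdet3d's create_data.py is the usual entry point (paths below assume the standard data/nuscenes layout):

# generate nuscenes_infos_{train,val}.pkl under ./data/nuscenes
python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes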

Train & Inference

# train
bash tools/dist_train.sh /path_to_your_config 8
# inference
bash tools/dist_test.sh /path_to_your_config /path_to_your_pth 8 --eval bbox
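
For example, with 8 GPUs and hypothetical paths (substitute the actual config and checkpoint names from this repository):

# hypothetical config/checkpoint paths, for illustration only
bash tools/dist_train.sh configs/fstr/fstr_nuscenes.py 8
bash tools/dist_test.sh configs/fstr/fstr_nuscenes.py work_dirs/fstr_nuscenes/latest.pth 8 --eval bbox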

Main Results

Results on the nuScenes val set. The default batch size is 2 per GPU, and FPS is evaluated on a single Tesla A100 GPU. (15e+5e means the last 5 of the 20 epochs are trained without GT-sample augmentation.)

Config     | mAP   | NDS   | Schedule | Inference FPS
-----------|-------|-------|----------|--------------
FSTR       | 64.2% | 69.1% | 15e+5e   | 15.4
FSTR-Large | 65.5% | 70.3% | 15e+5e   | 9.5

Results on the nuScenes test set. To reproduce our results, replace ann_file=data_root + '/nuscenes_infos_train.pkl' in the training config with ann_file=[data_root + '/nuscenes_infos_train.pkl', data_root + '/nuscenes_infos_val.pkl'], i.e., train on both the train and val splits (a config sketch follows the table):

Config            | mAP   | NDS   | Schedule | Inference FPS
------------------|-------|-------|----------|--------------
FSTR              | 66.2% | 70.4% | 15e+5e   | 15.4
FSTR + TTA        | 67.6% | 71.5% | 15e+5e   | -
FSTR-Large + TTA  | 69.5% | 73.0% | 15e+5e   | -
FSTR-XLarge + TTA | 70.2% | 73.5% | 15e+5e   | -
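
A minimal sketch of the ann_file change above, written as an mmdet3d-style config fragment (the exact nesting and surrounding keys follow the config files in this repository and are elided here):

# val-set setting: train on the train split only
data = dict(
    train=dict(
        ann_file=data_root + '/nuscenes_infos_train.pkl',
    ),
)

# test-set setting: train on the train and val splits together
data = dict(
    train=dict(
        ann_file=[
            data_root + '/nuscenes_infos_train.pkl',
            data_root + '/nuscenes_infos_val.pkl',
        ],
    ),
)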

Citation

If you find FSTR helpful in your research, please consider citing:

@article{zhang2023fully,
  title={Fully Sparse Transformer 3D Detector for LiDAR Point Cloud},
  author={Zhang, Diankun and Zheng, Zhijie and Niu, Haoyu and Wang, Xueqing and Liu, Xiaojun},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  year={2023},
  publisher={IEEE}
}

Contact

If you have any questions, feel free to open an issue or contact us at zhangdiankun19@mails.ucas.edu.cn, or tanfeiyang@megvii.com.

Acknowledgement

Parts of our code refer to the recent work CMT.