PointPillars_MultiHead_40FPS

A real-time 3D detection network (PointPillars) implemented with CUDA, TensorRT, and C++.

PointPillars

A high-performance version of the PointPillars 3D object detection network that achieves real-time processing (less than 1 ms per head).

  1. The inference part of PointPillars (PFE and multi-head backbone) is optimized with TensorRT.
  2. The pre- and post-processing are optimized by rewriting them in CUDA/C++.
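
As a rough illustration of what PointPillars preprocessing does in general, the sketch below groups points into vertical columns (pillars) on an x-y grid. It is written in plain Python for readability and is not this repository's CUDA kernel. The grid range, pillar size, and per-pillar point cap are assumptions chosen for the example, not values taken from this README.

```python
from collections import defaultdict

# Assumed grid parameters (not stated in this README); match them to your model config.
PILLAR_SIZE = 0.2                  # metres per pillar along x and y
X_MIN, Y_MIN = -51.2, -51.2        # lower bounds of the detection range
X_MAX, Y_MAX = 51.2, 51.2          # upper bounds of the detection range
MAX_POINTS_PER_PILLAR = 20         # extra points in a full pillar are discarded


def group_into_pillars(points):
    """Group (x, y, z, intensity) tuples by the x-y grid cell they fall into."""
    pillars = defaultdict(list)
    for point in points:
        x, y = point[0], point[1]
        if not (X_MIN <= x < X_MAX and Y_MIN <= y < Y_MAX):
            continue  # drop points outside the detection range
        col = int((x - X_MIN) / PILLAR_SIZE)
        row = int((y - Y_MIN) / PILLAR_SIZE)
        cell = pillars[(row, col)]
        if len(cell) < MAX_POINTS_PER_PILLAR:
            cell.append(point)
    return dict(pillars)


points = [
    (0.05, 0.05, -1.0, 0.3),
    (0.15, 0.10, -0.8, 0.5),    # falls into the same pillar as the first point
    (10.1, -3.05, 0.2, 0.1),
    (80.0, 0.0, 0.0, 0.9),      # outside the assumed range, dropped
]
pillars = group_into_pillars(points)
print(len(pillars), "non-empty pillars")
```

Each pillar is independent of the others, which is one reason this step maps well onto a GPU.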

Major Advance

Requirements (My Environment)

For *.onnx and *.trt engine file

  • Linux Ubuntu 18.04
  • OpenPCDet
  • ONNX IR version: 0.0.6
  • onnx2trt

For algorithm:

  • Linux Ubuntu 18.04
  • CMake 3.17
  • CUDA 10.2
  • TensorRT 7.1.3
  • yaml-cpp
  • google-test (not necessary)

For visualization

Usage

  1. Clone these two repositories and make sure all dependencies are installed.

    mkdir workspace && cd workspace
    git clone https://github.com/hova88/PointPillars_MultiHead_40FPS.git --recursive
    git clone https://github.com/hova88/OpenPCDet.git
  2. Generate the engine files

    • 2.1 PyTorch model --> ONNX model: the conversion tutorial is in the change log of hova88/OpenPCDet.

    • 2.2 ONNX model --> TensorRT model: once onnx2trt is installed, this step is simple. Note that to further improve inference speed, you must use half precision or mixed precision (e.g. -d 16).

          onnx2trt cbgs_pp_multihead_pfe.onnx -o cbgs_pp_multihead_pfe.trt -b 1 -d 16
          onnx2trt cbgs_pp_multihead_backbone.onnx -o cbgs_pp_multihead_backbone.trt -b 1 -d 16
    • 2.3 Engine file --> algorithm: specify the paths of the engine files (*.onnx, *.trt) in bootstrap.yaml.

    • 2.4 Download the test point cloud nuscenes_10sweeps_points.txt and specify its path in bootstrap.yaml.

  3. Compile and run

    cd PointPillars_MultiHead_40FPS
    mkdir build && cd build
    cmake .. && make -j8 && ./test/test_model
  4. Visualization

    cd PointPillars_MultiHead_40FPS/tools
    python viewer.py

The left figure shows the results of this repo; the right figure shows the official result of mmlab/OpenPCDet.

fig_method
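
Before pointing bootstrap.yaml at the downloaded test point cloud, it can help to sanity-check the file. The sketch below assumes the file is plain text with one point per line and whitespace-separated floats; this README does not document the format, so treat that layout and the five-column sample data as hypothetical.

```python
def load_points(lines):
    """Parse whitespace-separated floats, one point per line; skip blank lines."""
    points = []
    for line in lines:
        fields = line.split()
        if fields:
            points.append([float(f) for f in fields])
    return points


def summarize(points):
    """Return (number of points, set of column counts seen across all lines)."""
    return len(points), {len(p) for p in points}


# Stand-in data. For the real file, pass an open file object instead, e.g.
#   with open("nuscenes_10sweeps_points.txt") as f: points = load_points(f)
sample = ["1.0 2.0 -0.5 12.0 0.0", "", "3.5 -4.0 0.1 7.0 0.05"]
count, widths = summarize(load_points(sample))
print(count, widths)
```

If `widths` contains more than one value, some lines have a different number of fields than others, which usually indicates a truncated or corrupted download.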

Result

Using the *.trt engine files on an NVIDIA GeForce RTX 3080 Ti

With ScoreThreshold = 0.1

 | ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄> 
 | ../model/cbgs_pp_multihead_pfe.trt >
 |_____________________> 
             (\__/) ||                 
             (•ㅅ•) ||                 
             /   づ                                                         
                                                                  
 | ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄> 
 | ../model/cbgs_pp_multihead_backbone.trt >
 |_____________________> 
             (\__/) ||                 ****
             (•ㅅ•) ||                 
             /   づ     
                                                                  
------------------------------------
Module        Time        
------------------------------------
Preprocess    0.571069 ms
Pfe           3.26203  ms
Scatter       0.384075 ms
Backbone      2.92882  ms
Postprocess   8.82032  ms
Summary       15.9707  ms
------------------------------------
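
To read a timing table like the one above as throughput, the following sketch parses the log lines, converts the summary latency to frames per second, and reports the slowest stage. The parsing assumes each line has the shape `Name  value ms`, as in the log shown here. The frames-per-second figure also assumes frames are processed back to back with no other overhead.

```python
LOG = """\
Preprocess    0.571069 ms
Pfe           3.26203  ms
Scatter       0.384075 ms
Backbone      2.92882  ms
Postprocess   8.82032  ms
Summary       15.9707  ms
"""


def parse_timings(log):
    """Map module name -> milliseconds for lines shaped like 'Name  1.23 ms'."""
    timings = {}
    for line in log.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[2] == "ms":
            timings[parts[0]] = float(parts[1])
    return timings


timings = parse_timings(LOG)
total_ms = timings.pop("Summary")          # end-to-end latency per frame
fps = 1000.0 / total_ms                    # frames per second at that latency
slowest = max(timings, key=timings.get)    # stage with the largest time
share = timings[slowest] / total_ms

print(f"{fps:.1f} frames/s, slowest stage: {slowest} ({share:.0%} of total)")
# prints: 62.6 frames/s, slowest stage: Postprocess (55% of total)
```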

With ScoreThreshold = 0.4

 | ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄> 
 | ../model/cbgs_pp_multihead_pfe.trt >
 |_____________________> 
             (\__/) ||                 
             (•ㅅ•) ||                 
             /   づ                                                         
                                                                  
 | ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄ ̄> 
 | ../model/cbgs_pp_multihead_backbone.trt >
 |_____________________> 
             (\__/) ||                 ****
             (•ㅅ•) ||                 
             /   づ     
                                                                  
------------------------------------
Module        Time        
------------------------------------
Preprocess    0.337111 ms
Pfe           2.81834  ms
Scatter       0.161953 ms
Backbone      3.64112  ms
Postprocess   4.34731  ms
Summary       11.3101  ms
------------------------------------
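
In the two runs above, postprocessing time falls from 8.82 ms to 4.35 ms when ScoreThreshold is raised from 0.1 to 0.4. This README does not explain the difference. One plausible reason is that a higher threshold lets fewer candidate boxes through to non-maximum suppression (NMS). The sketch below illustrates that effect with hypothetical detections; it is not the repository's postprocessing code.

```python
def filter_by_score(detections, score_threshold):
    """Keep detections whose confidence is at least the threshold."""
    return [d for d in detections if d["score"] >= score_threshold]


def worst_case_nms_pairs(n):
    """Upper bound on box-pair overlap checks in a naive NMS over n boxes."""
    return n * (n - 1) // 2


# Hypothetical detections; real ones would come from the network heads.
detections = [
    {"label": "car", "score": 0.92},
    {"label": "car", "score": 0.45},
    {"label": "pedestrian", "score": 0.38},
    {"label": "barrier", "score": 0.15},
    {"label": "car", "score": 0.12},
    {"label": "truck", "score": 0.05},
]

for threshold in (0.1, 0.4):
    kept = filter_by_score(detections, threshold)
    print(threshold, len(kept), worst_case_nms_pairs(len(kept)))
# prints:
# 0.1 5 10
# 0.4 2 1
```

The trade-off is recall: a higher threshold also discards low-confidence boxes that may be true objects.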

Runtime logs

License

GNU General Public License v3.0 or later. See COPYING for the full text.