This project is modified based on the AIInfer, Tanks for this project.
This is a c++ version of the AI reasoning library. Currently, it only supports the reasoning of the tensorrt model. The follow-up plan supports the c++ reasoning of frameworks such as Openvino, NCNN, and MNN. There are two versions for pre- and post-processing, c++ version and cuda version. It is recommended to use the cuda version., This repository provides accelerated deployment cases of deep learning CV popular models, and cuda c supports dynamic-batch image process, infer, decode, NMS.
- 2023.05.27 update yolov5、yolov7、yolov8、yolox
- 2023.05.28 update rt_detr
- 2023.06.01 update yolov8_seg、yolov8_pose
- 2023.06.09 update yolov7_cutoff
- 2023.06.14 update yolov7-pose
- 2023.06.15 Adding Producer-Consumer Inference Model for yolov8-det
- 2023.06.24 update 3D objection detection algorithm smoke
- 2023.09.06 update deploy for detr in mmdetection
- 2024.01.26 update yolov8-obb
- 2024.02.06 update depth-anything
- 2024.02.12 update yolop & yolopv2
- 2024.02.26 update yolov9
The following environments have been tested:
- ubuntu16.04
- cuda11.1
- cudnn8.6.0
- TensorRT-8.5.1.7
- gcc5.4.0
- cmake-3.24.0
- opencv-4.5.5
- Eigen3
- yaml
You can also use docker, How to use it is as follows:
docker pull longxiaowyh/dl_model_infer:v1.0
nvidia-docker run -itu root:root --name dl_model_infer --gpus all -v /your_path:/target_path -v /tmp/.X11-unix/:/tmp/.X11-unix/ -e DISPLAY=unix$DISPLAY -e GDK_SCALE -e GDK_DPI_SCALE -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=compute,utility --shm-size=64g longxiaowyh/dl_model_infer:v1.0 /bin/bash
- RT-DETR model export tutorial
- Yolov8 model export tutorial
- Yolov5 model export tutorial
- Yolov7 model export tutorial
- yolov7_cutoff model export tutorial
- yolov7-pose model export tutorial
- smoke model export tutorial
- DETR model export tutorial
- Put the workspaces/detr_pytorch2onnx.py file under the mmdetection path.
- Modify the config_file and checkpoint_file paths in the detr_pytorch2onnx.py file.
- Use the detr_pytorch2onnx.py file to generate onnx file.
- Use trtexec to generate engine files.
- DepthAnything model export tutorial
- YOLOP model export tutorial
- Refer to the workspace/yolop_model_compile.sh file
- Yolov9 model export tutorial
- cpm.hpp Producer-consumer model
- For direct inference tasks, cpm.hpp can be turned into an automatic multi-batch producer-consumer model
cpm::Instance<BoxArray, Image, yolov8_detector> cpmi;
auto result_futures = cpmi.commits(yoloimages);
for (int ib = 0; ib < result_futures.size(); ++ib)
{
auto objs = result_futures[ib].get();
auto image = images[ib].clone();
for (auto& obj : objs)
{
process....
}
}
Take yolov8 target detection as an example,modify CMakeLists.txt and run the command below:
git clone git@github.com:yhwang-hub/dl_model_infer.git
cd dl_model_infer
mkdir build && cd build
cmake .. && make
cd ../workspaces
./infer -f yolov8n.transd.trt -i res/dog.jpg -b 10 -c 10 -o cuda_res -t yolov8_det
You can also use a script to execute,The above instructions are written in the compile_and_run.sh script,for example:
rm -rf build && mkdir -p build && cd build && cmake .. && make -j9 && cd ..
# mkdir -p build && cd build && cmake .. && make -j48 && cd ..
cd workspaces
rm -rf cuda_res/*
# ./infer -f yolov8n.transd.trt -i res/dog.jpg -b 10 -c 10 -o cuda_res -t yolov8_det
cd ..
Then execute the following command to run
cd dl_model_infer
bash compile_and_run.sh
AiInfer
|--application # Implementation of model inference application, your own model inference can be implemented in this directory
|--yolov8_det_app # Example: A yolov8 detection implemented
|--xxxx
|--utils # tools directory
|--backend # here implements the reasoning class of backend
|--common # There are some commonly used tools in it
|--arg_parsing.h # Command line parsing class, similar to python's argparse
|--cuda_utils.h # There are some common tool functions of cuda in it
|--cv_cpp_utils.h # There are some cv-related utility functions in it
|--memory.h # Tools related to cpu and gpu memory application and release
|--model_info.h # Commonly used parameter definitions for pre- and post-processing of the model, such as mean variance, nms threshold, etc.
|--utils.h # Commonly used tool functions in cpp, timing, mkdir, etc.
|--cpm.h # Producer-Consumer Inference Model
|--post_process # Post-processing implementation directory, cuda post-processing acceleration, if you have custom post-processing, you can also write it here
|--pre_process # pre-processing implementation directory, cuda pre-processing acceleration, if you have custom pre-processing can also be written here
|--tracker # This is the implementation of the target detection and tracking library, which has been decoupled and can be deleted directly if you don’t want to use it
|--workspaces # Working directory, where you can put some test pictures/videos, models, and then directly use the relative path in main.cpp
|--mains # This is the collection of main.cpp, where each app corresponds to a main file, which is easy to understand, and it is too redundant to write together
|--main.cpp # Project entry
Tested on Jetson Orin, the test includes the entire process (image preprocessing + model inference + post-processing decoding)、
Model | Precision | Resolution | FPS(bs=1) |
---|---|---|---|
rtdetr_r50 | FP16 | 640x640 | 19 |
yolov8n | FP16 | 640x640 | 126 |
yolov8n-seg | FP16 | 640x640 | 92 |
yolov8s-pose | FP16 | 640x640 | 58 |
yolov8s-obb | FP16 | 1024x1024 | 38 |
yolov5s | FP16 | 640x640 | 92 |
yolov7 | FP16 | 640x640 | 34 |
yolov7_cutoff | FP16 | 640x640 | 32 |
yolov7-w6-pose | FP16 | 960x960 | 22 |
yolox_s | FP16 | 640x640 | 91 |
detr | FP16 | 800x1190 | 18 |
depth_anything_vits14 | FP16 | 518x518 | 19 |
yolop | FP16 | 640x640 | 40 |
yolopv2 | FP16 | 480x640 | 26 |
Thanks for the following items