I used to study Deblurring and Video Super-Resolution, but I am also very interested in model deployment. I found that there are very few deployment cases for low-level vision tasks, which may be due to the low demand for such tasks and the loss of accuracy after deployment. After learning TensorRT and NCNN, which are excellent inference frameworks, I decided to build deployment cases for low-level vision and open-source them.
This repo is dedicated to providing deployment cases in the low-level vision field, using inference frameworks such as TensorRT and NCNN to deploy tasks such as deblurring, image super-resolution, video super-resolution, and image denoising.
The repo will also provide a series of tutorials, such as writing TensorRT custom operators, building a network with the TensorRT API, handling multiple inputs and multiple outputs, and performance testing and bottleneck analysis of the engines generated by TensorRT.
TensorRT is installed by using the Dockerfile from MMDeploy. NCNN can then be installed directly inside that container.
The following software is required:
- Docker
- TensorRT
- NCNN
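Before building, it can help to confirm that the basic tools are reachable. The following is a minimal sketch: `check_tools` is a hypothetical helper, not part of this repo, and the tool names passed to it are only examples.

```shell
# Hypothetical helper (not part of this repo): report which commands are missing from PATH.
# Prints one "missing: <name>" line per absent tool; returns 1 if any is absent, else 0.
check_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_tools docker git` can be run on the host before cloning, and the same function can be reused inside the container with other tool names.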
- Clone MMDeploy.
    git clone -b master https://github.com/open-mmlab/mmdeploy.git MMDeploy
- Build the Docker image (GPU).
    cd MMDeploy
    docker build docker/GPU/ -t mmdeploy:master-gpu
- Run the Docker container.
    docker run --gpus all -it mmdeploy:master-gpu
- Install the NCNN dependencies, fetch the NCNN source, and set up the Vulkan SDK.
    apt install build-essential git cmake libprotobuf-dev protobuf-compiler libvulkan-dev vulkan-utils libopencv-dev
    git clone https://github.com/Tencent/ncnn.git
    cd ncnn
    git submodule update --init
    wget https://sdk.lunarg.com/sdk/download/1.2.189.0/linux/vulkansdk-linux-x86_64-1.2.189.0.tar.gz?Human=true -O vulkansdk-linux-x86_64-1.2.189.0.tar.gz
    tar -xf vulkansdk-linux-x86_64-1.2.189.0.tar.gz
    export VULKAN_SDK=$(pwd)/1.2.189.0/x86_64
- Compile NCNN.
    mkdir -p build
    cd build
    cmake -DNCNN_VULKAN=ON ..
    make -j4
    make install
- Build and run the TensorRT-Spynet demo.
    git clone https://github.com/niehen6174/LVMD.git
    cd LVMD/TensorRT/Spynet
    mkdir build
    cd build
    wget https://xsj-niehen.oss-cn-hangzhou.aliyuncs.com/lvmd/Spynet.wts
    cmake ..
    make
    ./spynet -s
    ./spynet -d
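The demo above serializes the engine and then runs inference every time. As a sketch, the two steps can be wrapped so that serialization is skipped once an engine file exists. `spynet_run` is a hypothetical helper, and the engine path is an assumption: pass whatever file `./spynet -s` writes in your build.

```shell
# Hypothetical wrapper (not part of this repo): serialize only when the engine is missing.
# $1 = path to the spynet binary
# $2 = engine file that the -s step is assumed to write (the name depends on the demo code)
spynet_run() {
    local bin="$1" engine="$2"
    if [ ! -f "$engine" ]; then
        "$bin" -s || return 1   # generate the serialized engine
    fi
    "$bin" -d                   # run inference with the serialized engine
}
```

The engine path is kept as a parameter because this README does not state the file name that the demo writes.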
Demo of TensorRT-Spynet
- Clone the repo
    git clone https://github.com/niehen6174/LVMD.git
- Create the build directory
    cd LVMD/TensorRT/Spynet
    mkdir build
    cd build
- Download the weights (.wts) file
    wget https://xsj-niehen.oss-cn-hangzhou.aliyuncs.com/lvmd/Spynet.wts
- Compile
    cmake ..
    make
- Generate the serialized engine
    ./spynet -s
- Run inference
    ./spynet -d
- Measure the inference time of the serialized engine
    trtexec --loadEngine=./addplugin.engine --plugins=./libFlowWarp.so --shapes=ref:3x32x32,supp:3x32x32 --verbose > result.log

    -- result.log
    [02/13/2023-02:07:47] [I] Host Latency
    [02/13/2023-02:07:47] [I] min: 1.02942 ms (end to end 1.271 ms)
    [02/13/2023-02:07:47] [I] max: 5.34741 ms (end to end 5.45886 ms)
    [02/13/2023-02:07:47] [I] mean: 1.21322 ms (end to end 1.40523 ms)
    [02/13/2023-02:07:47] [I] median: 1.18549 ms (end to end 1.29443 ms)
    [02/13/2023-02:07:47] [I] percentile: 1.32043 ms at 99% (end to end 2.52673 ms at 99%)
    [02/13/2023-02:07:47] [I] throughput: 0 qps
    [02/13/2023-02:07:47] [I] walltime: 2.44427 s
    [02/13/2023-02:07:47] [I] Enqueue Time
    [02/13/2023-02:07:47] [I] min: 1.11456 ms
    [02/13/2023-02:07:47] [I] max: 5.32129 ms
    [02/13/2023-02:07:47] [I] median: 1.13934 ms
    [02/13/2023-02:07:47] [I] GPU Compute
    [02/13/2023-02:07:47] [I] min: 1.01123 ms
    [02/13/2023-02:07:47] [I] max: 5.33582 ms
    [02/13/2023-02:07:47] [I] mean: 1.19868 ms
    [02/13/2023-02:07:47] [I] median: 1.17108 ms
    [02/13/2023-02:07:47] [I] percentile: 1.29785 ms at 99%
    [02/13/2023-02:07:47] [I] total compute time: 2.44051 s
- Profile the time spent in each layer of the model
    nsys profile --force-overwrite=true --stats=true -o model-OnlyRun ./spynet -d

    -- output
    NVTX Push-Pop Range Statistics:

    Time(%)  Total Time (ns)  Instances       Average     Minimum     Maximum  Range
    -------  ---------------  ---------  ------------  ----------  ----------  --------------------------------------------------------------------------------
       50.0       1284566211          1  1284566211.0  1284566211  1284566211  TensorRT:ExecutionContext::enqueue
       49.9       1280651003          1  1280651003.0  1280651003  1280651003  TensorRT:(Unnamed Layer* 19) [Convolution] + (Unnamed Layer* 20) [Activation]
        0.0           228912          1      228912.0      228912      228912  TensorRT:(Unnamed Layer* 21) [Convolution] + (Unnamed Layer* 22) [Activation]
        0.0           221764          1      221764.0      221764      221764  TensorRT:(Unnamed Layer* 79) [Convolution] + (Unnamed Layer* 80) [Activation]
        0.0           206198          1      206198.0      206198      206198  TensorRT:ExecutionContext::recompute
        0.0           181466          1      181466.0      181466      181466  TensorRT:(Unnamed Layer* 97) [Convolution] + (Unnamed Layer* 98) [ElementWise]
        0.0           153100          1      153100.0      153100      153100  TensorRT:(Unnamed Layer* 35) [Convolution] + (Unnamed Layer* 36) [Activation]
        0.0           129186          1      129186.0      129186      129186  TensorRT:(Unnamed Layer* 81) [Convolution] + (Unnamed Layer* 82) [Activation]
        0.0           118147          1      118147.0      118147      118147  TensorRT:(Unnamed Layer* 23) [Convolution] + ...
Analysis of NVIDIA Nsight Systems.
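To compare runs without reading the logs by eye, the two outputs above can be reduced to a few numbers. The sketch below assumes the layouts shown above, with each log entry on its own line, and assumes the console output of each tool was saved to a file. `trtexec_mean` and `nvtx_ms` are hypothetical helper names, not part of this repo, `trtexec`, or `nsys`.

```shell
# Hypothetical helpers (not part of this repo, trtexec, or nsys).

# Print the "mean" value in ms listed under one section heading of a trtexec log.
# $1 = section heading, e.g. "GPU Compute"; $2 = log file.
# Prints nothing when the section has no mean line.
trtexec_mean() {
    awk -v section="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)                  # drop leading whitespace
            sub(/^\[[^ ]*\] \[[A-Z]\] /, "", line)    # drop the "[timestamp] [I] " prefix
            sub(/[ \t\r]+$/, "", line)                # drop trailing whitespace
        }
        line == section                    { in_section = 1; next }
        in_section && line !~ /^[a-z ]+: / { exit }   # next heading: the section is over
        in_section && line ~ /^mean: /     { split(line, f, " "); print f[2]; exit }
    ' "$2"
}

# Turn NVTX range rows into "<total ms><TAB><range>" lines.
# $1 = file holding the saved "NVTX Push-Pop Range Statistics" table.
nvtx_ms() {
    awk '
        # Data rows start with a percentage followed by an integer total in nanoseconds.
        $1 ~ /^[0-9]+\.[0-9]+$/ && $2 ~ /^[0-9]+$/ {
            range = $7
            for (i = 8; i <= NF; i++) range = range " " $i
            printf "%.3f\t%s\n", $2 / 1000000, range
        }
    ' "$1"
}
```

With the log shown above saved as `result.log`, `trtexec_mean "GPU Compute" result.log` prints `1.19868`. Other versions of `trtexec` or `nsys` may use a different layout, in which case the patterns need adjusting.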
- TensorRT-DeblurGAN
- TensorRT-Real-EsrGAN
- TensorRT-Spynet
- TensorRT-Basicvsr
- TensorRT-flow_warp Plugin
- TensorRT-Basicvsr backbone
- TensorRT-Basicvsr Triton
- NCNN-DeblurGAN
- NCNN-Real-EsrGAN
See the open issues for a full list of proposed features (and known issues).
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
Distributed under the MIT License. See LICENSE.txt for more information.
Niehen6174 - niehen6174@qq.com
Project Link: LVMD