zerollzeng/tensorrt-zoo

How to run openpose other models demo on tensorrt-zoo

MolianWH opened this issue · 8 comments

Hi, I want to test other openpose models, such as face and hand. I specified the openpose face model path, but no engine file was generated.

env

  • OS: ubuntu18.04
  • platform: vscode
  • CUDA: 10.1
  • CUDNN: 7.6.4.38
  • Nvidia driver: 440.44
  • GPU: GeForce GTX 1070, 8G memory

Inputs

# in launch.json
...
"args": [
                "testopenpose",
                "--prototxt",
                "./bin/openpose_models/face/pose_deploy.prototxt",
                "--caffemodel",
                "./bin/openpose_models/face/pose_iter_116000.caffemodel",
                "--save_engine",
                "./bin/openpose_engine/face/openpose.engine",
                "--input",
                "/home/dreamdeck/Documents/code/test/tensorrt-zoo/bin/input/COCO_val2014_000000000241.jpg",
                "--run_mode",
                "0"
            ],
...

Details

I also changed input_dim in pose_deploy.prototxt to 480 and 640. Here it is:

input: "image"
input_dim: 1
input_dim: 3
input_dim: 480 # Original: 368
input_dim: 640 # Original: 368
...

Errors

It reports "Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output." and "[error] read create engine file". I found that it didn't generate the engine file.
Here is the full output.

usage: path/to/testopenpose --prototxt path/to/prototxt --caffemodel path/to/caffemodel/ --save_engine path/to/save_engin --input path/to/input/img --run_mode 0/1/2
[2020-05-08 13:48:04.354] [info] create plugin factory
[2020-05-08 13:48:04.354] [info] yolo3 params: class: 1, netSize: 416 
[2020-05-08 13:48:04.354] [info] upsample params: scale: 2
[2020-05-08 13:48:04.354] [info] prototxt: ./bin/openpose_models/face/pose_deploy.prototxt
[2020-05-08 13:48:04.354] [info] caffeModel: ./bin/openpose_models/face/pose_iter_116000.caffemodel
[2020-05-08 13:48:04.354] [info] engineFile: ./bin/openpose_engine/face/openpose.engine
[2020-05-08 13:48:04.354] [info] outputBlobName: 
net_output 
[2020-05-08 13:48:04.354] [info] build caffe engine with ./bin/openpose_models/face/pose_deploy.prototxt and ./bin/openpose_models/face/pose_iter_116000.caffemodel
[2020-05-08 13:48:04.689] [info] Number of network layers: 106
[2020-05-08 13:48:04.689] [info] Number of input: 
Input layer: 
image : 3x480x640 
[2020-05-08 13:48:04.689] [info] Number of output: 1
Output layer: 
net_output : 71x60x80 
[2020-05-08 13:48:04.689] [info] parse network done
[2020-05-08 13:48:04.689] [info] fp16 support: false
[2020-05-08 13:48:04.689] [info] int8 support: true
[2020-05-08 13:48:04.689] [info] Max batchsize: 1
[2020-05-08 13:48:04.689] [info] Max workspace size: 10485760
[2020-05-08 13:48:04.689] [info] Number of DLA core: 0
[2020-05-08 13:48:04.689] [info] Max DLA batchsize: 268435456
[2020-05-08 13:48:04.689] [info] Current use DLA core: 0
[2020-05-08 13:48:04.689] [info] build engine...
Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
Detected 1 inputs and 1 output network tensors.
[2020-05-08 13:48:39.917] [info] serialize engine to ./bin/openpose_engine/face/openpose.engine
[2020-05-08 13:48:39.917] [info] save engine to ./bin/openpose_engine/face/openpose.engine...
[2020-05-08 13:48:43.591] [error] read create engine file ./bin/openpose_engine/face/openpose.engine failed
[2020-05-08 13:48:43.592] [info] create execute context and malloc device memory...
[2020-05-08 13:48:43.592] [info] init engine...
[2020-05-08 13:48:43.593] [info] malloc device memory
nbBingdings: 2
[2020-05-08 13:48:43.593] [info] input: 
[2020-05-08 13:48:43.593] [info] binding bindIndex: 0, name: image, size in byte: 3686400
[2020-05-08 13:48:43.593] [info] binding dims with 3 dimemsion
3 x 480 x 640   
[2020-05-08 13:48:43.595] [info] output: 
[2020-05-08 13:48:43.595] [info] binding bindIndex: 1, name: net_output, size in byte: 1363200
[2020-05-08 13:48:43.595] [info] binding dims with 3 dimemsion
71 x 60 x 80   
=====>malloc extra memory for openpose...
heatmap Dims3
heatmap size: 1 71 60 80
allocate heatmap host and divice memory done
resize map size: 1 71 240 320
kernel size: 1 71 240 320
allocate kernel host and device memory done
peaks size: 1 25 128 3
allocate peaks host and device memory done
=====> malloc extra memory done
[2020-05-08 13:48:43.688] [info] net forward takes 91.2896 ms
inference Time : 94.946 ms
[1] + Done                       "/usr/bin/gdb" --interpreter=mi --tty=${DbgTerm} 0<"/tmp/Microsoft-MIEngine-In-2zjhhvwr.c77" 1>"/tmp/Microsoft-MIEngine-Out-gg24uk5d.g89"
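
The sizes in the log above can be checked by hand. The sketch below assumes float32 tensors (4 bytes per element) and reads an output stride of 8 off the logged shapes (480x640 in, 60x80 out). Neither is stated by the tool itself, and the helper names are made up for illustration.

```python
from math import prod

BYTES_PER_FLOAT32 = 4  # assumption: the bindings hold float32 values


def binding_bytes(dims):
    """Size in bytes of a tensor with the given dimensions."""
    return prod(dims) * BYTES_PER_FLOAT32


def output_hw(input_h, input_w, stride=8):
    """Heatmap height and width for an input size; stride 8 is inferred from the log."""
    return input_h // stride, input_w // stride


print(binding_bytes((3, 480, 640)))  # input binding "image"      -> 3686400
print(binding_bytes((71, 60, 80)))   # output binding "net_output" -> 1363200
print(output_hw(480, 640))           # -> (60, 80)
```

Both byte counts match the "size in byte" values in the log, so the 480x640 input was parsed as intended. Under the stride-8 assumption, input heights and widths that are multiples of 8 divide evenly into the heatmap size.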

Questions

  1. Should I change the dims in the prototxt file, and if so, what should the new values be based on?
  2. Is the error related to GPU memory?

Thanks!

It reports "Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output."

Don't worry about this.

[error] read create engine file

https://github.com/zerollzeng/tiny-tensorrt/blob/eb0d3a7a7ed7e8894f8b897cef6bada64e736331/Trt.cpp#L218

Lastly, the only openpose model I tested in tensorrt-zoo was the body keypoint model. I'm not sure whether the other models have the same post-processing phase.

Thanks. I debugged it and found that the error was caused by the output folder not existing, since I hadn't created it. The engine file is now created successfully, but no keypoints are drawn on result.jpg.
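
Since the failure came from a missing output folder, one way to avoid it is to create the folder before launching the binary. A minimal sketch, assuming a small Python wrapper script is acceptable; `prepare_engine_path` is a hypothetical helper, not part of tensorrt-zoo or tiny-tensorrt:

```python
from pathlib import Path


def prepare_engine_path(engine_path):
    """Create the parent folder of engine_path if it is missing and return the path."""
    path = Path(engine_path)
    # parents=True creates intermediate folders; exist_ok=True makes this a no-op on reruns
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

Calling `prepare_engine_path("./bin/openpose_engine/face/openpose.engine")` before the run creates `./bin/openpose_engine/face` if needed. Running `mkdir -p ./bin/openpose_engine/face` in a shell has the same effect.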

Try checking the post-processing code.

Do I need to rewrite the DoInference function and the BodyPartConnector.cu file? I found that the poseKeypoints variable is null. The connectBodyPartsCpu function seems to be written for body 25.

Yes, I'm afraid so. This might require you to modify all of the post-processing code, including BodyPartConnector.cu, posenms.cu, etc. I created the openpose example by borrowing the post-processing code from the openpose source and modifying it slightly so that it works with the TensorRT output. You can give it a shot.
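
Before rewriting the connector, it may help to confirm that the engine output contains usable peaks at all. The sketch below is not the OpenPose post-processing. It is a plain per-channel argmax over a flat float buffer in channel-major (CHW) order, which assumes at most one instance per keypoint channel (plausible for a cropped face or hand, not for multi-person body output). The function name, the threshold, and the buffer layout are assumptions for illustration.

```python
def channel_peaks(buf, channels, height, width, threshold=0.1):
    """Return one (x, y, score) per channel, or None if the maximum is below threshold.

    buf is a flat sequence of floats in channel-major (CHW) order.
    """
    if len(buf) != channels * height * width:
        raise ValueError("buffer length does not match channels*height*width")
    plane = height * width
    peaks = []
    for c in range(channels):
        start = c * plane
        # index of the largest value inside this channel's plane
        best = max(range(plane), key=lambda i: buf[start + i])
        score = buf[start + best]
        if score < threshold:
            peaks.append(None)
        else:
            peaks.append((best % width, best // width, score))
    return peaks


# Tiny synthetic example: 2 channels of 3x4, with a clear peak in channel 0 only.
heatmap = [0.0] * (2 * 3 * 4)
heatmap[1 * 4 + 2] = 0.9  # channel 0, row 1, column 2
print(channel_peaks(heatmap, channels=2, height=3, width=4))  # [(2, 1, 0.9), None]
```

For the face engine in this thread, the call would use channels=71, height=60, width=80 on the net_output buffer. The coordinates are in heatmap pixels, so they would need scaling by the input-to-heatmap ratio (8 in the log above) to land on the image. If every channel comes back as None, the problem is likely upstream of the connector.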

Thanks. I will have a look.
I also tested engines built from other models, such as coco_18: net forward takes 127.127 ms, and the other openpose models also take 100+ ms. Only the body 25 model that you kindly provided is faster, at 53.9651 ms.
Have you tested coco_18? Does tiny-tensorrt support it?

I haven't tested the coco_18 model, but for the inference speed, take a look at https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/faq.md#difference-between-body_25-vs-coco-vs-mpi for reference.

Closing due to inactivity.