How to run openpose other models demo on tensorrt-zoo

Question

MolianWH opened this issue 5 years ago · 8 comments

Hi, I want to test other OpenPose models, such as face and hand. I specified the OpenPose face model path, but no engine file was generated.

env

OS: ubuntu18.04
platform: vscode
CUDA: 10.1
CUDNN: 7.6.4.38
Nvidia driver: 440.44
GPU: GeForce GTX 1070, 8G memory

Inputs

# in launch.json
...
"args": [
                "testopenpose",
                "--prototxt",
                "./bin/openpose_models/face/pose_deploy.prototxt",
                "--caffemodel",
                "./bin/openpose_models/face/pose_iter_116000.caffemodel",
                "--save_engine",
                "./bin/openpose_engine/face/openpose.engine",
                "--input",
                "/home/dreamdeck/Documents/code/test/tensorrt-zoo/bin/input/COCO_val2014_000000000241.jpg",
                "--run_mode",
                "0"
            ],
...

Details

I also changed input_dim in pose_deploy.prototxt to 480 and 640, as shown here:

input: "image"
input_dim: 1
input_dim: 3
input_dim: 480 # Original: 368
input_dim: 640 # Original: 368
...

Errors

It reports "Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output." and "[error] read create engine file". I found that no engine file was generated.
Here is the full output:

usage: path/to/testopenpose --prototxt path/to/prototxt --caffemodel path/to/caffemodel/ --save_engine path/to/save_engin --input path/to/input/img --run_mode 0/1/2
[2020-05-08 13:48:04.354] [info] create plugin factory
[2020-05-08 13:48:04.354] [info] yolo3 params: class: 1, netSize: 416 
[2020-05-08 13:48:04.354] [info] upsample params: scale: 2
[2020-05-08 13:48:04.354] [info] prototxt: ./bin/openpose_models/face/pose_deploy.prototxt
[2020-05-08 13:48:04.354] [info] caffeModel: ./bin/openpose_models/face/pose_iter_116000.caffemodel
[2020-05-08 13:48:04.354] [info] engineFile: ./bin/openpose_engine/face/openpose.engine
[2020-05-08 13:48:04.354] [info] outputBlobName: 
net_output 
[2020-05-08 13:48:04.354] [info] build caffe engine with ./bin/openpose_models/face/pose_deploy.prototxt and ./bin/openpose_models/face/pose_iter_116000.caffemodel
[2020-05-08 13:48:04.689] [info] Number of network layers: 106
[2020-05-08 13:48:04.689] [info] Number of input: 
Input layer: 
image : 3x480x640 
[2020-05-08 13:48:04.689] [info] Number of output: 1
Output layer: 
net_output : 71x60x80 
[2020-05-08 13:48:04.689] [info] parse network done
[2020-05-08 13:48:04.689] [info] fp16 support: false
[2020-05-08 13:48:04.689] [info] int8 support: true
[2020-05-08 13:48:04.689] [info] Max batchsize: 1
[2020-05-08 13:48:04.689] [info] Max workspace size: 10485760
[2020-05-08 13:48:04.689] [info] Number of DLA core: 0
[2020-05-08 13:48:04.689] [info] Max DLA batchsize: 268435456
[2020-05-08 13:48:04.689] [info] Current use DLA core: 0
[2020-05-08 13:48:04.689] [info] build engine...
Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
Detected 1 inputs and 1 output network tensors.
[2020-05-08 13:48:39.917] [info] serialize engine to ./bin/openpose_engine/face/openpose.engine
[2020-05-08 13:48:39.917] [info] save engine to ./bin/openpose_engine/face/openpose.engine...
[2020-05-08 13:48:43.591] [error] read create engine file ./bin/openpose_engine/face/openpose.engine failed
[2020-05-08 13:48:43.592] [info] create execute context and malloc device memory...
[2020-05-08 13:48:43.592] [info] init engine...
[2020-05-08 13:48:43.593] [info] malloc device memory
nbBingdings: 2
[2020-05-08 13:48:43.593] [info] input: 
[2020-05-08 13:48:43.593] [info] binding bindIndex: 0, name: image, size in byte: 3686400
[2020-05-08 13:48:43.593] [info] binding dims with 3 dimemsion
3 x 480 x 640   
[2020-05-08 13:48:43.595] [info] output: 
[2020-05-08 13:48:43.595] [info] binding bindIndex: 1, name: net_output, size in byte: 1363200
[2020-05-08 13:48:43.595] [info] binding dims with 3 dimemsion
71 x 60 x 80   
=====>malloc extra memory for openpose...
heatmap Dims3
heatmap size: 1 71 60 80
allocate heatmap host and divice memory done
resize map size: 1 71 240 320
kernel size: 1 71 240 320
allocate kernel host and device memory done
peaks size: 1 25 128 3
allocate peaks host and device memory done
=====> malloc extra memory done
[2020-05-08 13:48:43.688] [info] net forward takes 91.2896 ms
inference Time : 94.946 ms
[1] + Done                       "/usr/bin/gdb" --interpreter=mi --tty=${DbgTerm} 0<"/tmp/Microsoft-MIEngine-In-2zjhhvwr.c77" 1>"/tmp/Microsoft-MIEngine-Out-gg24uk5d.g89"

Questions

Should I change the dims in the prototxt file, and if so, what should the new values be based on?
Is the error related to GPU memory?

Thanks!
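
As a quick sanity check on the dims question, the sizes in the log above can be reproduced by hand. The sketch below assumes float32 bindings and a fixed downsampling stride of 8; the stride is only inferred from the log (480 -> 60, 640 -> 80), and `Chw`, `bindingBytes`, and `heatmapDims` are hypothetical helper names, not part of tiny-tensorrt.

```cpp
#include <cstddef>

// Dimensions of one CHW tensor (batch size 1), as printed in the log.
struct Chw {
    int c, h, w;
};

// Size in bytes of a float32 CHW tensor.
std::size_t bindingBytes(const Chw& d) {
    return static_cast<std::size_t>(d.c) * d.h * d.w * sizeof(float);
}

// Heatmap size for a given input size, ASSUMING the network downsamples
// by a fixed stride (8 is inferred from the log, not from the prototxt).
Chw heatmapDims(const Chw& input, int channels, int stride = 8) {
    return Chw{channels, input.h / stride, input.w / stride};
}
```

With an input of 3x480x640 this gives 3686400 bytes for the `image` binding and, for a 71-channel heatmap, 71x60x80 and 1363200 bytes for `net_output`, matching the log. If the assumption holds, input heights and widths that are multiples of 8 avoid truncation in the division.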

Answer 1 · 2020-05-08T06:37:26.000Z

"Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output."

Don't worry about this; it is only a performance warning.

Regarding "[error] read create engine file", see this line in tiny-tensorrt:

https://github.com/zerollzeng/tiny-tensorrt/blob/eb0d3a7a7ed7e8894f8b897cef6bada64e736331/Trt.cpp#L218

Finally, the only OpenPose model I tested in tensorrt-zoo was the body keypoint model. I'm not sure whether the other models share the same post-processing phase.

Answer 2 · 2020-05-08T07:50:36.000Z

Thanks. I debugged and found the cause: I hadn't created the output folder. The engine file is now created successfully, but no keypoints are drawn on result.jpg.
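
For anyone hitting the same save failure, one way to avoid it is to create the parent directory before writing the serialized engine. This is a minimal sketch, assuming a C++17 compiler; `saveBlob` is a hypothetical helper for illustration and is not the function tiny-tensorrt uses to save engines.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

// Writes `data` to `path`, creating missing parent directories first.
// Returns false if the directories or the file cannot be created.
bool saveBlob(const std::string& path, const std::string& data) {
    namespace fs = std::filesystem;
    const fs::path parent = fs::path(path).parent_path();
    if (!parent.empty()) {
        std::error_code ec;
        fs::create_directories(parent, ec);  // no-op if it already exists
        if (ec) return false;
    }
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    return out.good();
}
```

Called with a path such as ./bin/openpose_engine/face/openpose.engine, this would create ./bin/openpose_engine/face/ if it is missing instead of failing on the write.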

Answer 3 · 2020-05-08T07:56:34.000Z

Try checking the post-processing code.

Answer 4 · 2020-05-08T08:03:44.000Z

Do I need to rewrite the DoInference function and the BodyPartConnector.cu file? I found that the poseKeypoints variable is null, and the connectBodyPartsCpu function seems to be written for body_25.

Answer 5 · 2020-05-09T06:25:49.000Z

"Do I need to rewrite the DoInference function and the BodyPartConnector.cu file? I found that the poseKeypoints variable is null, and the connectBodyPartsCpu function seems to be written for body_25."

Yes, I'm afraid so, and this might require modifying all of the post-processing code, including BodyPartConnector.cu, posenms.cu, etc. I created the OpenPose example by borrowing the post-processing code from the OpenPose source and modifying it slightly so that it works with the TensorRT output. You can give it a shot.
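
As a possible starting point for such a rewrite, here is a simplified CPU sketch that takes the highest-scoring pixel of each heatmap channel as a keypoint. It is not the tiny-tensorrt or OpenPose code: it assumes a single face or hand in view (one keypoint per channel, so no body-part connection step), a CHW float heatmap such as the 71x60x80 `net_output` from the log, and a stride of 8 inferred from that log. `Keypoint` and `argmaxKeypoints` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

struct Keypoint {
    float x, y, score;
};

// For each channel of a CHW heatmap, pick the highest-scoring pixel.
// ASSUMPTION: one keypoint per channel, so no part matching is needed.
std::vector<Keypoint> argmaxKeypoints(const std::vector<float>& heatmap,
                                      int channels, int height, int width,
                                      int stride = 8) {
    std::vector<Keypoint> keypoints;
    keypoints.reserve(channels);
    const std::size_t plane = static_cast<std::size_t>(height) * width;
    for (int c = 0; c < channels; ++c) {
        const float* map = heatmap.data() + c * plane;
        std::size_t best = 0;
        for (std::size_t i = 1; i < plane; ++i) {
            if (map[i] > map[best]) best = i;
        }
        // Scale heatmap coordinates back to input-image pixels.
        keypoints.push_back({static_cast<float>(best % width) * stride,
                             static_cast<float>(best / width) * stride,
                             map[best]});
    }
    return keypoints;
}
```

The caller decides how many channels to pass and which scores to discard as too low; whether every channel of the face model corresponds to a keypoint is not confirmed in this thread.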

Answer 6 · 2020-05-11T03:22:11.000Z

Thanks. I will have a look.
I also tested engines for other models, such as coco_18: net forward takes 127.127 ms, and the other OpenPose models also take 100+ ms, whereas the body_25 model that you kindly provided takes 53.9651 ms.
Have you tested coco_18? Does tiny-tensorrt support it?

Answer 7 · 2020-05-11T04:27:28.000Z

I haven't tested the coco_18 model, but for the inference speed, take a look at https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/faq.md#difference-between-body_25-vs-coco-vs-mpi for reference.

Answer 8 · 2020-05-15T09:05:09.000Z

Closing due to inactivity.