Mobilenet implementation

Question

Opened this issue 7 years ago · 14 comments

Hello, I tried to deploy MobileNet-SSD (https://github.com/chuanqi305/MobileNet-SSD) and have a problem with the output shape:
"OUTPUT1 Tensor Shape is: C: 21 H: 1917 W: 1"
It should be 1917 x 21 x 1.
Do you have any clue?

Answer 1 · 2018-08-13T21:42:21.000Z

@NguyenHongHanh @saikumarGadde
My mobilenet-ssd shape output is

INPUT Tensor Shape is: C: 3 H: 300 W: 300
mbox_conf_softmax Tensor Shape is: C: 21 H: 1917 W: 1
mbox_loc Tensor Shape is: C: 7668 H: 1 W: 1
mbox_priorbox Tensor Shape is: C: 2 H: 7668 W: 1
OUTPUT Tensor Shape is: C: 1 H: 100 W: 7

But the detection results are bad: classification and detection are both wrong.
Have you gotten the correct inference result yet?

Answer 2 · 2018-08-14T02:56:52.000Z

For mobilenet-ssd, I have got the correct inference result. The mbox_priorbox Tensor Shape is: C: 21 H: 1917 W: 1, and the OUTPUT Tensor Shape is: C: 1 H: 100 W: 7

Answer 3 · 2018-08-14T03:11:25.000Z

@JingliangGao
I have fixed the vgg-ssd issue (thanks again), but now I fail to parse the mobilenet-ssd prototxt/weights.
Could you share your mobilenet-ssd implementation with me via email or Baidu Pan?

Answer 4 · 2018-08-14T06:10:17.000Z

@JingliangGao
So it looks like my mbox_priorbox shape is incorrect?
I'm using TensorRT 4.0 with CUDA 8, chuanqi305's weights, chenzhi1992's plugin prototxt, and @saikumarGadde's implementation with some modifications. I use group convolution for the depthwise convolutions, so there is no separate depthwise conv implementation.
It would be greatly appreciated if you could share some insight on how to get the correct result!
My email: matyih2004@gmail.com
Thanks~

Answer 5 · 2018-08-14T06:22:49.000Z

Well... check each layer's shape and make sure it is the same as the Caffe output. Then you will get the correct result.
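
One way to follow this advice is to dump each layer's shape from both frameworks and compare them side by side. The sketch below is only an illustration: the `compare_shapes` helper and the two dictionaries are hypothetical, the TensorRT values are the shapes printed in this thread, and the reference entry for mbox_conf_softmax is the 1917 x 21 x 1 shape the question expected. It separates a real mismatch (different element count) from a difference in dimensions only (same element count), which is the 21 x 1917 versus 1917 x 21 case.

```python
from math import prod

def compare_shapes(reference, candidate):
    """Classify each reference layer as 'match', 'layout' (same element
    count but different dimensions), 'mismatch', or 'missing'."""
    report = {}
    for name, ref_shape in reference.items():
        cand_shape = candidate.get(name)
        if cand_shape is None:
            report[name] = "missing"
        elif tuple(cand_shape) == tuple(ref_shape):
            report[name] = "match"
        elif prod(cand_shape) == prod(ref_shape):
            report[name] = "layout"
        else:
            report[name] = "mismatch"
    return report

# Shapes are written as (C, H, W).
expected = {
    "mbox_conf_softmax": (1917, 21, 1),  # shape the question expected
    "mbox_loc": (7668, 1, 1),
}
tensorrt = {
    "mbox_conf_softmax": (21, 1917, 1),  # shape printed by TensorRT
    "mbox_loc": (7668, 1, 1),
}

report = compare_shapes(expected, tensorrt)
for layer, status in report.items():
    print(f"{layer}: {status}")
```

A "layout" result does not prove the data is wrong, only that the downstream layer must read the buffer with the same axis order that the producer used.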

Answer 6 · 2018-08-14T20:50:46.000Z

@JingliangGao Thank you.
The output shapes are the same as Caffe's output.

Here are a few things I'm not certain about.
Could you confirm that I can use group conv for depthwise conv and still get the correct result?
The second thing is my detection_out setting:
createSSDDetectionOutputPlugin({true, false, 0, 21, 400, 200, 0.5, 0.45, CodeTypeSSD::CENTER_SIZE, {0,1,2}, false, true})
I am not sure about the last two parameters, as they are not present in Caffe.
I tested all the combinations without success, but I want to make sure which one is correct. By the way, this setting works well for VGG-SSD.
Thanks!

Answer 7 · 2018-08-15T00:14:07.000Z

TensorRT 3 or 4 with cuDNN 7 supports group convolution but does not support depthwise convolution directly.
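
For reference, a depthwise convolution is mathematically a group convolution whose group count equals the number of input channels, which is why the substitution gives the same numbers. The pure-Python sketch below illustrates that equivalence on a tiny input; the helper names and test values are made up for illustration, and it says nothing about how TensorRT or cuDNN implement either operation.

```python
def grouped_conv2d(x, w, groups):
    """Valid, stride-1 grouped convolution (cross-correlation, as in Caffe).

    x: input as [C_in][H][W] nested lists
    w: weights as [C_out][C_in // groups][kH][kW] nested lists
    """
    c_in, c_out = len(x), len(w)
    in_per_group = c_in // groups
    out_per_group = c_out // groups
    k_h, k_w = len(w[0][0]), len(w[0][0][0])
    out_h = len(x[0]) - k_h + 1
    out_w = len(x[0][0]) - k_w + 1
    y = [[[0.0] * out_w for _ in range(out_h)] for _ in range(c_out)]
    for oc in range(c_out):
        # First input channel of the group this output channel belongs to.
        first_in = (oc // out_per_group) * in_per_group
        for ic in range(in_per_group):
            for i in range(out_h):
                for j in range(out_w):
                    for di in range(k_h):
                        for dj in range(k_w):
                            y[oc][i][j] += (x[first_in + ic][i + di][j + dj]
                                            * w[oc][ic][di][dj])
    return y

def depthwise_conv2d(x, kernels):
    """Reference depthwise convolution: one kH x kW kernel per input channel."""
    k_h, k_w = len(kernels[0]), len(kernels[0][0])
    out_h = len(x[0]) - k_h + 1
    out_w = len(x[0][0]) - k_w + 1
    return [[[sum(x[c][i + di][j + dj] * kernels[c][di][dj]
                  for di in range(k_h) for dj in range(k_w))
              for j in range(out_w)]
             for i in range(out_h)]
            for c in range(len(x))]

# Two 3x3 input channels and one 2x2 kernel per channel.
x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
     [[9, 8, 7], [6, 5, 4], [3, 2, 1]]]
kernels = [[[1, 0], [0, 1]],
           [[0, 1], [1, 0]]]

# With groups == C_in, each output channel sees exactly one input channel.
weights = [[k] for k in kernels]  # shape [C_out][1][kH][kW]
grouped = grouped_conv2d(x, weights, groups=len(x))
depthwise = depthwise_conv2d(x, kernels)
print(grouped == depthwise)
```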

You may want to understand createSSDDetectionOutputPlugin first, because some of the parameters you set are not right.

Answer 8 · 2018-08-15T00:20:10.000Z

@JingliangGao Thanks for the reply.
Sorry, I referenced the wrong code (from vgg-ssd).
This is the one I'm trying with mobilenet-ssd on TensorRT 4:
createSSDDetectionOutputPlugin({true, false, 0, 21, 100, 100, 0.25, 0.45, CodeTypeSSD::CENTER_SIZE, {0,1,2}, false, true})
Thanks
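
To make the two positional settings in this thread easier to compare, the sketch below attaches names to them and prints which values differ. The field names are an assumption, recalled from the `DetectionOutputParameters` struct in TensorRT 4's `NvInferPlugin.h`, and should be checked against the header shipped with your TensorRT version; the Python tuple itself is only an illustration and is not part of any TensorRT API.

```python
from collections import namedtuple

# Field names are an ASSUMPTION (recalled from TensorRT 4's NvInferPlugin.h);
# verify them against your own header before relying on the order.
DetectionOutputParameters = namedtuple("DetectionOutputParameters", [
    "shareLocation", "varianceEncodedInTarget", "backgroundLabelId",
    "numClasses", "topK", "keepTopK", "confidenceThreshold", "nmsThreshold",
    "codeType", "inputOrder", "confSigmoid", "isNormalized",
])

# Values copied from the two createSSDDetectionOutputPlugin calls in this thread.
vgg_ssd = DetectionOutputParameters(
    True, False, 0, 21, 400, 200, 0.5, 0.45,
    "CENTER_SIZE", (0, 1, 2), False, True)
mobilenet_ssd = DetectionOutputParameters(
    True, False, 0, 21, 100, 100, 0.25, 0.45,
    "CENTER_SIZE", (0, 1, 2), False, True)

# Collect only the fields whose values differ between the two settings.
changed = {
    field: (getattr(vgg_ssd, field), getattr(mobilenet_ssd, field))
    for field in vgg_ssd._fields
    if getattr(vgg_ssd, field) != getattr(mobilenet_ssd, field)
}
for field, (vgg_value, mobilenet_value) in changed.items():
    print(f"{field}: {vgg_value} -> {mobilenet_value}")
```

Under that naming, the last two positions are the ones without a Caffe counterpart, and the two settings differ only in the top-k limits and the confidence threshold.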

Answer 9 · 2018-08-23T23:34:31.000Z

@JingliangGao My understanding is that cuDNN 7 optimizes group-wise convolutions when n_channels/n_groups is equal to 1, 2, or 4. I have tested this myself and dug through the documentation.

Answer 10 · 2018-08-24T05:35:27.000Z

@paghdv So that's why I got 6 ms for mobilenet-ssd on a GTX 1080 (I couldn't test Caffe's speed because there are some bugs when using cuDNN)

Answer 11 · 2018-08-24T06:20:07.000Z

11.5ms for mobilenet-ssd on GTX1050Ti
Thanks @myih

Answer 12 · 2018-10-16T03:19:59.000Z

@JingliangGao
My mobilenet-ssd shape output is

INPUT Tensor Shape is: C: 3 H: 300 W: 300
mbox_conf_softmax Tensor Shape is: C: 1917 H: 21 W: 1
mbox_loc Tensor Shape is: C: 7668 H: 1 W: 1
mbox_priorbox Tensor Shape is: C: 2 H: 7668 W: 1
OUTPUT Tensor Shape is: C: 1 H: 100 W: 7

I compared them with the Caffe output.
The mbox_conf_softmax and mbox_loc outputs are the same as in Caffe, but the mbox_priorbox output is different.
In TensorRT, the mbox_priorbox shape is C: 2 H: 7668 W: 1
In Caffe, the mbox_priorbox shape is C: 15336 H: 7668 W: 1

Then I printed the output of every priorbox layer:
conv11_mbox_priorbox
conv13_mbox_priorbox
conv14_2_mbox_priorbox
conv15_2_mbox_priorbox
conv16_2_mbox_priorbox
conv17_2_mbox_priorbox
I found that the C dimension of every priorbox output differs from the Caffe priorbox output.
In TensorRT, the C dimension of every priorbox output is 2.

TensorRT vgg-ssd runs successfully and gives the correct result.
I also printed its priorbox output: the C dimension is 2 there as well, which differs from the Caffe output.
So the TensorRT output shape differs from the Caffe output shape. Is that normal?

TensorRT MobileNet-SSD gives an incorrect result.
Please help me, thanks.
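
For context on the C: 2 question: in the Caffe SSD convention, a PriorBox output is usually described as two channels, one holding the box coordinates and one holding the variances, each of length num_priors x 4. Under that assumption, C: 2 H: 7668 is consistent with 1917 priors, since 1917 x 4 = 7668. The sketch below checks that arithmetic and shows how one prior and one mbox_loc offset are combined under the CENTER_SIZE code type, assuming the variances are not already encoded in the target. The prior, variance, and offset values are made up for illustration.

```python
import math

NUM_PRIORS = 1917        # from the shapes posted in this thread
COORDS_PER_BOX = 4       # xmin, ymin, xmax, ymax

# ASSUMED PriorBox layout: channel 0 = coordinates, channel 1 = variances.
priorbox_shape = (2, NUM_PRIORS * COORDS_PER_BOX, 1)
mbox_loc_shape = (NUM_PRIORS * COORDS_PER_BOX, 1, 1)

def decode_center_size(prior, variance, loc):
    """Decode one box with the CENTER_SIZE code type.

    prior:    (xmin, ymin, xmax, ymax) of the prior box, normalized to [0, 1]
    variance: the four variances stored for that prior
    loc:      (dx, dy, dw, dh) predicted offsets taken from mbox_loc
    """
    prior_w = prior[2] - prior[0]
    prior_h = prior[3] - prior[1]
    prior_cx = (prior[0] + prior[2]) / 2
    prior_cy = (prior[1] + prior[3]) / 2
    # Offsets move the center in units of the prior size and scale
    # the width and height exponentially.
    cx = variance[0] * loc[0] * prior_w + prior_cx
    cy = variance[1] * loc[1] * prior_h + prior_cy
    w = math.exp(variance[2] * loc[2]) * prior_w
    h = math.exp(variance[3] * loc[3]) * prior_h
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

prior = (0.4, 0.4, 0.6, 0.6)      # illustrative values only
variance = (0.1, 0.1, 0.2, 0.2)   # illustrative values only
print(priorbox_shape, mbox_loc_shape)
print(decode_center_size(prior, variance, (0.0, 0.0, 0.0, 0.0)))
```

With all offsets at zero, the decoded box is the prior itself up to floating-point rounding, which is a quick sanity check when comparing prior values between the two frameworks.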

Answer 13 · 2018-10-16T06:26:32.000Z

@Ghustwb Your output shape should be correct.
When I was getting wrong results, it was because the image loader is different in chuanqi305's repo. Make sure you change that part.
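
As a rough illustration of what "changing the image loader" can involve, the sketch below normalizes pixel values and reorders them from H x W x C to C x H x W. The constants are an assumption: as far as I recall, chuanqi305's deploy settings subtract 127.5 and scale by 0.007843 (about 1/127.5), but please verify both values, the channel order, and the 300 x 300 resize against the repository. The `preprocess` helper is hypothetical.

```python
MEAN = 127.5       # ASSUMED mean value, verify against the repository
SCALE = 0.007843   # ASSUMED scale, roughly 1 / 127.5

def preprocess(image_hwc):
    """Convert an H x W x C image with values in 0..255 into a
    C x H x W nested list of floats in roughly [-1, 1]."""
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    return [[[(image_hwc[y][x][c] - MEAN) * SCALE
              for x in range(width)]
             for y in range(height)]
            for c in range(channels)]

# A 1 x 2 "image" with three channels per pixel, just to show the layout.
tiny = [[[0, 127.5, 255], [255, 255, 255]]]
chw = preprocess(tiny)
print(len(chw), len(chw[0]), len(chw[0][0]))
```

If the VGG-SSD loader subtracts per-channel means without scaling, reusing it unchanged would feed MobileNet-SSD inputs in a very different range, which could explain wrong detections even when every shape matches.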

Answer 14 · 2018-10-16T10:03:13.000Z

@myih Thanks for your reply
My image loader was different from chuanqi305's, and I changed it.
But there are still problems. The detection_out output is:
0 15 0.98 0.249136 0.45 0.251 0.45
It seems the classification and confidence are correct, so the bbox must be wrong.

This problem has been bothering me for a long time...
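
To make the symptom concrete, the sketch below parses the detection_out row quoted above and checks the box size. It assumes the usual Caffe SSD row layout of [image_id, label, confidence, xmin, ymin, xmax, ymax], which matches the W: 7 output shape; the helper names and the `min_size` threshold are hypothetical.

```python
from collections import namedtuple

# ASSUMED row layout, following the Caffe SSD DetectionOutput convention.
Detection = namedtuple(
    "Detection", "image_id label confidence xmin ymin xmax ymax")

def parse_detection(row):
    """Parse one whitespace-separated detection_out row."""
    values = row.split()
    return Detection(int(values[0]), int(values[1]),
                     *(float(v) for v in values[2:7]))

def is_degenerate(det, min_size=0.01):
    """Flag boxes whose normalized width or height is close to zero."""
    return (det.xmax - det.xmin) < min_size or (det.ymax - det.ymin) < min_size

det = parse_detection("0 15 0.98 0.249136 0.45 0.251 0.45")
print(det.label, det.confidence, is_degenerate(det))
```

Here ymin equals ymax and xmin is almost equal to xmax, so the box has essentially no area. That points at the box decoding inputs (mbox_loc, mbox_priorbox, or the detection output parameters) rather than at the class scores.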
