PKUZHOU/MTCNN_FaceDetection_TensorRT

Intuition of replacing PReLU

mibrahimy opened this issue · 9 comments

Can you explain how scaling, applying ReLU, then scaling again and adding elementwise is equivalent to PReLU?

PReLU keeps the original values when inputs are positive and multiplies negative inputs by a scale factor (trained beforehand). So you can use ReLU(1*x) + Scale_2(ReLU(Scale_1(x))): Scale_1 multiplies x by -1, turning the originally negative values positive, and the subsequent ReLU keeps those values (while zeroing the originally positive ones). Scale_2 then multiplies by the trained scale factors. Note that the scale factors are multiplied by -1 before being written into the weight file, which guarantees the originally negative parts become negative again.
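In other words, PReLU(x) = ReLU(x) + (-a)·ReLU(-x), where a is the trained slope. A minimal numpy sketch (not part of this repo) that checks the identity per channel:

```python
import numpy as np

def prelu(x, a):
    # Reference PReLU: identity for positive inputs, slope a for negative ones
    return np.where(x > 0, x, a * x)

def prelu_via_relu_scale(x, a):
    # Scale_1 (-1) flips the sign so negative inputs become positive,
    # ReLU keeps them, Scale_2 (-a, the negated trained slope) flips them
    # back and applies the slope; the Eltwise SUM adds the positive path.
    return np.maximum(x, 0) + (-a) * np.maximum(-x, 0)

x = np.random.randn(1, 32, 8, 8)       # NCHW activations
a = np.random.rand(1, 32, 1, 1) * 0.5  # one trained slope per channel
assert np.allclose(prelu(x, a), prelu_via_relu_scale(x, a))
```

The -a factor is exactly what ends up in the second Scale layer's weight blob.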

Thank you for the insight. @PKUZHOU

Hi @PKUZHOU,
Can you provide the script that transforms PReLU into the combination of ReLU, Scale, and Eltwise-Add?
I think it would be very helpful for generalizing this method to other networks to attain higher performance!

Sorry, the script was so simple that I didn't keep it after converting the MTCNN weights, and I haven't worked on CNN acceleration in a long time, so I have no plans to rewrite it. In fact, if you are familiar with the data format and layout of Caffe models, it is quite easy to write a script that converts PReLU in the way I explained.
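For anyone writing that script, here is a minimal, untested sketch assuming pycaffe. The layer and file names (prelu1, prelu1_neg, prelu1_w, the two prototxts) are hypothetical placeholders for whatever your converted network uses:

```python
import caffe

# The converted prototxt is assumed to replace each PReLU layer with
#   ReLU(x) + Scale_w(ReLU(Scale_neg(x)))  combined by an Eltwise SUM,
# where Scale_neg has a fixed weight of -1 and Scale_w receives the
# negated PReLU slopes, as explained above.
src = caffe.Net('mtcnn_prelu.prototxt', 'mtcnn_prelu.caffemodel', caffe.TEST)
dst = caffe.Net('mtcnn_relu.prototxt', caffe.TEST)

# Copy every layer that kept its name (convs, fully-connected, ...).
for name, blobs in src.params.items():
    if name in dst.params:
        for i, blob in enumerate(blobs):
            dst.params[name][i].data[...] = blob.data

# For each replaced PReLU layer, write -1 into the first Scale layer
# and the negated per-channel slopes into the second one.
for prelu_name, neg_name, w_name in [('prelu1', 'prelu1_neg', 'prelu1_w')]:
    slopes = src.params[prelu_name][0].data    # shape: (channels,)
    dst.params[neg_name][0].data[...] = -1.0
    dst.params[w_name][0].data[...] = -slopes  # negated, see above

dst.save('mtcnn_relu.caffemodel')
```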

Okay! I implemented it in Python; it replaces the PReLU layers in the prototxt and copies the weights into the caffemodel in one command. I will post it once I clean up the unrelated code.

@kketernality Can you share your conversion code? Thanks

@xiexuliunian Made a gist for the PReLU conversion, since I needed to do it too:
https://gist.github.com/Helios77760/c1317a3f791617c5dbc8cdce071c9576

@Helios77760 Thanks.

@Helios77760 Thanks for your code, but when I run this script I get "Cannot copy param 0 weights from layer 'PReLU_1'; shape mismatch. Source param shape is (1); target param shape is 32 (32). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer." Do you have any suggestions? Thanks