anson0910/CNN_face_detection

Failed to test using deploy.prototxt of net-12

Closed this issue · 8 comments

Hi, @anson0910
I trained the net-12 model using CNN_face_detection_models/face_12c/train_val.prototxt
and loaded it with CNN_face_detection_models/face_12c/deploy.prototxt.
Then I called detect_face_12c_net in CNN_face_detection/face_detection/face_detection_functions.py. It threw this error: "inner_product_layer.cpp:64] Check failed: K_ == new_K (400 vs. 605472) Input size incompatible with inner product parameters."
I think it was caused by the input image size.
In face_detection_functions.py, the original test image is resized to multiple scales, all of which are larger than 3x12x12, but deploy.prototxt requires a 3x12x12 image as input. It seems that Caffe doesn't do any sliding-window processing automatically.
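
The two numbers in the error message can be checked with a little arithmetic. The sketch below assumes a 12-net layout that is not stated in this thread: one 3x3 convolution with 16 filters (stride 1, no padding) followed by a 3x3 max pool with stride 2, using Caffe's round-up rule for pooling output size. The function name `fc_input_size` is hypothetical. Under those assumptions a 12x12 input gives exactly the 400 values the inner product layer expects, and any larger input gives more.

```python
import math


def fc_input_size(height, width, channels=16,
                  conv_kernel=3, pool_kernel=3, pool_stride=2):
    """Number of values reaching the first inner product layer.

    Assumed layout (not confirmed in the thread): conv 3x3 stride 1 without
    padding, then max pool 3x3 stride 2 with the output size rounded up.
    """
    conv_h = height - conv_kernel + 1
    conv_w = width - conv_kernel + 1
    pool_h = math.ceil((conv_h - pool_kernel) / pool_stride) + 1
    pool_w = math.ceil((conv_w - pool_kernel) / pool_stride) + 1
    return channels * pool_h * pool_w


print(fc_input_size(12, 12))    # 400
print(fc_input_size(320, 478))  # 605472
```

The 320x478 size is only one input that reproduces the second number under these assumptions. The actual size of the test image is not given in the thread.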

I think the detect_face_12c_net function is meant to feed the image forward through the fully convolutional version of the net, so if you trained the 12-net yourself, you can use the face_net_surgery/face_12_surgery.py script to convert face_12c to a fully convolutional network!
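
To illustrate why this conversion works, here is a minimal NumPy sketch of the general idea. It is not the contents of face_12_surgery.py. It assumes the inner product layer sees a 16x5x5 feature map (which matches the 400 in the error message) and has 16 outputs. The helper names `fc_to_conv_weights` and `conv_valid` are hypothetical. Reshaping the fully connected weights into convolution kernels gives the same output on a 5x5 map, and a grid of outputs on a larger map.

```python
import numpy as np


def fc_to_conv_weights(fc_weights, channels, height, width):
    """Reshape (num_out, C*H*W) inner product weights into (num_out, C, H, W) kernels."""
    num_out, k = fc_weights.shape
    if k != channels * height * width:
        raise ValueError("weights do not match the assumed feature map shape")
    return fc_weights.reshape(num_out, channels, height, width)


def conv_valid(feature_map, kernels, bias):
    """Naive stride-1 convolution without padding (cross-correlation, as in Caffe)."""
    _, h, w = feature_map.shape
    num_out, _, kh, kw = kernels.shape
    out = np.empty((num_out, h - kh + 1, w - kw + 1))
    for y in range(out.shape[1]):
        for x in range(out.shape[2]):
            patch = feature_map[:, y:y + kh, x:x + kw]
            # Dot product of every kernel with the current patch.
            out[:, y, x] = np.tensordot(kernels, patch, axes=([1, 2, 3], [0, 1, 2])) + bias
    return out


rng = np.random.default_rng(0)
fc_w = rng.standard_normal((16, 400))  # assumed: 16 outputs, 16*5*5 inputs
fc_b = rng.standard_normal(16)
kernels = fc_to_conv_weights(fc_w, 16, 5, 5)

small = rng.standard_normal((16, 5, 5))   # feature map of a 12x12 crop (assumed)
large = rng.standard_normal((16, 8, 9))   # feature map of a larger image

print(np.allclose(conv_valid(small, kernels, fc_b)[:, 0, 0], fc_w @ small.ravel() + fc_b))
print(conv_valid(large, kernels, fc_b).shape)
```

Each position in the larger output corresponds to one window of the input, which is why the converted net can scan a whole image in a single forward pass.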

Otherwise, if you wish to use the sliding-window technique, you need to crop windows of the appropriate size from the input image and feed them forward one at a time.
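
A rough sketch of the cropping step, for anyone going the sliding-window route. The function name `sliding_windows` and the stride of 4 are illustrative assumptions, not values from the repository. Each yielded crop would then be passed to the original (non-convolutional) 12-net, and the whole loop repeated for every scale of the image pyramid.

```python
import numpy as np


def sliding_windows(image, window=12, stride=4):
    """Yield (x, y, crop) for every window x window crop of an H x W x C image."""
    height, width = image.shape[:2]
    for y in range(0, height - window + 1, stride):
        for x in range(0, width - window + 1, stride):
            yield x, y, image[y:y + window, x:x + window]


# Usage with a dummy image; a real image would come from the resized test image.
dummy = np.zeros((20, 28, 3), dtype=np.uint8)
crops = list(sliding_windows(dummy))
print(len(crops))  # 15
```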

Thanks a lot.
I should have converted the model trained from train_val.prototxt using face_12_surgery.py. I tried this just now, and detection runs without any error. However, the results seem wrong. Is it possible that my model didn't converge? I think my trained face_12c model has converged, according to the loss (0.00003) and accuracy (1) reported by Caffe. Is it overfitted? I used about 20,000 positive samples from AFLW and 60,000 negative samples cropped from some background images.
BTW: when I use your face12c_full_conv.caffemodel, it works like a charm

I observed that the detection rectangles from face12c_full_conv.caffemodel and face_12c_train_iter_400000.caffemodel are exactly the same.
In other words, face_12_surgery.py didn't make any difference.

Sorry, I did not encounter such a problem... not sure how to solve it

Could you tell me how to determine the threshold for the 12-net? Yours is 0.01, which seems very low to me.

Quoting the original paper:
"We then apply a 2-stage cascade consists of the 12-net and 12-calibration-net on a subset of the AFLW images to choose a threshold T1 at 99% recall rate. Then we densely scan all background images with the 2-stage cascade. All detection windows with confidence score larger than T1 become the negative training samples for the 24-net."

Basically, if you wish to have higher recall but do not care about precision, then the lower the better!
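
As a minimal sketch of picking a threshold at a target recall: given the net's confidence scores on a set of known face windows, take the score below which only the allowed fraction of faces falls. The function name `threshold_at_recall` and the sample scores are made up for illustration, and this is not necessarily how the 0.01 value in the repository was obtained.

```python
import numpy as np


def threshold_at_recall(face_scores, recall=0.99):
    """Largest threshold such that at least `recall` of the faces score >= it."""
    scores = np.sort(np.asarray(face_scores, dtype=float))[::-1]  # descending
    keep = int(np.ceil(recall * len(scores)))  # number of faces that must pass
    return scores[keep - 1]


# Hypothetical scores for known face windows.
rng = np.random.default_rng(0)
face_scores = rng.beta(5, 1, size=1000)
t1 = threshold_at_recall(face_scores, recall=0.99)
print(t1, np.mean(face_scores >= t1))
```

A threshold chosen this way can end up very low when a few faces get poor scores, which is consistent with trading precision for recall in the first stage.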

Thanks, @anson0910. I fixed the problem by replacing the buggy version of Caffe with the official version.

Great!