Understanding cascading of sizes in mtcnn
sidgan opened this issue · 0 comments
Hi,
I'm trying to follow the code and understand how mtcnn works. As I understand it, for each image and each scale, detections come from each of the networks. Right now I am asking specifically about the PNet.
# Code file: mtcnn_detector.py
local_boxes = self.Pool.map( detect_first_stage_warpper, izip(repeat(img), self.PNets[:len(batch)], [scales[i] for i in batch], repeat(self.threshold[0])) )
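To check my reading of that call, here is a minimal sketch of how the argument tuples seem to be built: one network and one scale per worker, with the image and threshold repeated. It uses Python 3 `zip` in place of `izip`, and the network names, scales, batch indices and threshold are hypothetical placeholders, not values from the repository.

```python
from itertools import repeat

def build_first_stage_args(img, nets, scales, batch, threshold):
    """Pair each scale in the batch with one network, sharing img and threshold."""
    return list(zip(
        repeat(img),                   # same image for every task
        nets[:len(batch)],             # one PNet instance per task
        [scales[i] for i in batch],    # the scales selected by this batch
        repeat(threshold),             # same first-stage threshold for every task
    ))

# Hypothetical stand-ins for the real objects.
img = "img"
nets = ["pnet0", "pnet1", "pnet2", "pnet3"]
scales = [0.6, 0.43, 0.3, 0.21, 0.15, 0.107]
batch = [4, 5]

args = build_first_stage_args(img, nets, scales, batch, 0.6)
for task in args:
    print(task)
```

So each worker would receive a single (image, network, scale, threshold) tuple, and `zip` stops at the shortest input, which is why the repeated image and threshold do not run forever.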
The image is rescaled according to the scales computed earlier, and the rescaled image (now called input_buf) goes into the PNet.
# Code file: helper.py
# ORIGINAL Height: 340
# ORIGINAL Width: 151
# SCALE USED (were computed before): 0.107493555074
# RESCALED Height: 37
# RESCALED Width: 17
output = net.predict(input_buf)
For reference I have printed out the original size and the rescaled size.
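The printed numbers can be reproduced with a small sketch, assuming the rescaled size is the original size times the scale, rounded up. The rounding mode is my assumption: it matches the values above, whereas rounding to nearest would give a width of 16. The function name `rescaled_size` is made up for illustration.

```python
import math

def rescaled_size(height, width, scale):
    """Return (height, width) after scaling, rounding each dimension up."""
    return int(math.ceil(height * scale)), int(math.ceil(width * scale))

hs, ws = rescaled_size(340, 151, 0.107493555074)
print(hs, ws)  # 37 17
```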
The net corresponds to PNet, and in det1.prototxt (PNet) the input size is declared as h=12 and w=12.
# Code file: det1.prototxt
input_dim: 1
input_dim: 3
input_dim: 12
input_dim: 12
What I don't understand is how the input goes from the size of input_buf (37x17 here) to 12x12. Where does that happen?
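One possibility I have not confirmed in the code: if PNet is applied fully convolutionally, 12x12 would be the window size that produces a single output cell rather than a size the input has to be resized to. The sketch below only does the spatial arithmetic under an assumed layer layout (3x3 conv, 2x2 max-pool with stride 2 and round-up, 3x3 conv, 3x3 conv, no padding). That layout is an assumption, since only the input_dim lines of det1.prototxt are quoted above.

```python
import math

def conv_out(size, kernel):
    """Output length of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, kernel=2, stride=2):
    """Output length of a pooling layer that rounds up (assumed convention)."""
    return int(math.ceil((size - kernel) / stride)) + 1

def pnet_map_size(size):
    """Output length along one axis for the assumed PNet layer layout."""
    size = conv_out(size, 3)   # first 3x3 conv
    size = pool_out(size)      # 2x2 max-pool, stride 2
    size = conv_out(size, 3)   # second 3x3 conv
    size = conv_out(size, 3)   # third 3x3 conv
    return size

print(pnet_map_size(12))                     # 1
print(pnet_map_size(37), pnet_map_size(17))  # 14 4
```

Under these assumptions a 12x12 input gives a 1x1 output, and the 37x17 input_buf would give a 14x4 map, with each cell corresponding to one 12x12 window of the rescaled image.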