happynear/FaceVerification

DeepID model

Closed this issue · 4 comments

I have some trouble understanding the models you released. Since I'm not familiar with how a siamese network works in Caffe, I can't see why, for example, conv11 and conv12 have different numbers of outputs (asymmetric). Have you tried training without a siamese structure, as the first version of DeepID does? I've been working on this for weeks, but the best result I've achieved is only about 82%. Did you also crop patches to train ensemble models?

I haven't reproduced DeepID2 yet. The parameters of the contrastive loss (loss_weight, margin) are difficult to tune. I only trained a pure softmax network, and its accuracy on LFW is no lower than softmax + contrastive, so I guess the contrastive loss is not necessary.
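For reference, Caffe's `ContrastiveLossLayer` in its default (non-legacy) form computes, over a batch of $N$ pairs,

$$
\mathcal{L} = \frac{1}{2N} \sum_{n=1}^{N} \left[ y_n\, d_n^2 + (1 - y_n)\, \max(m - d_n,\, 0)^2 \right], \qquad d_n = \lVert a_n - b_n \rVert_2,
$$

where $y_n = 1$ for a genuine (same-identity) pair and $m$ is the margin. The `loss_weight` then scales this term against the softmax loss, so the margin and the balance between the two losses have to be tuned jointly.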

conv11 and conv12 are different layers, one stacked after the other. The network design is inspired by the VGG-19 network.
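Schematically it looks like this (the filter counts here are only illustrative, not the actual values from the released model):

```
# conv12 consumes conv11's output within a single branch, so the two
# layers' num_output values are independent -- nothing siamese here.
layer {
  name: "conv11"
  type: "Convolution"
  bottom: "data"
  top: "conv11"
  convolution_param { num_output: 32 kernel_size: 3 pad: 1 stride: 1 }
}
layer {
  name: "relu11"
  type: "ReLU"
  bottom: "conv11"
  top: "conv11"
}
layer {
  name: "conv12"
  type: "Convolution"
  bottom: "conv11"
  top: "conv12"
  convolution_param { num_output: 64 kernel_size: 3 pad: 1 stride: 1 }
}
```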

Pairs of images are concatenated at the beginning of the net and split apart at the end to feed into the contrastive loss.
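In prototxt terms it is roughly the following (the layer/blob names and the loss_weight value are illustrative, not the actual ones from the released model):

```
# Stack the two images of each pair along the batch (num) axis, so one
# set of weights processes both halves of every pair.
layer {
  name: "concat_pair"
  type: "Concat"
  bottom: "data_a"
  bottom: "data_b"
  top: "data"
  concat_param { axis: 0 }
}
# ... shared trunk: conv11, conv12, ..., fully connected feature layer ...
# Cut the feature blob back into the two halves; with no slice_point
# given, Caffe splits the axis evenly between the tops.
layer {
  name: "slice_pair"
  type: "Slice"
  bottom: "feature"
  top: "feat_a"
  top: "feat_b"
  slice_param { axis: 0 }
}
layer {
  name: "contrastive_loss"
  type: "ContrastiveLoss"
  bottom: "feat_a"
  bottom: "feat_b"
  bottom: "sim_label"   # 1 = same identity, 0 = different
  top: "contrastive_loss"
  loss_weight: 0.01     # illustrative value; one of the hard-to-tune knobs
  contrastive_loss_param { margin: 1.0 }
}
```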

@happynear Thanks for your quick response. Do you mean you used pairs for training and didn't do any cropping as the paper describes? I'm also confused about the scale factor you used. Is there a particular reason to use 0.0078125 instead of 0.00390625?

Yes, I used the aligned face images; no cropping is used. The exact scale does not matter much. Since I have subtracted the mean image, dividing by 128 (1/128 = 0.0078125) scales the pixel gray levels to roughly [-1, 1].
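This is just the data layer's `transform_param`: Caffe's data transformer subtracts the mean first and then multiplies by `scale`, so each pixel becomes (pixel - mean) / 128. A minimal sketch (the file names are placeholders):

```
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  transform_param {
    mean_file: "mean.binaryproto"  # subtract the mean image first...
    scale: 0.0078125               # ...then multiply by 1/128
  }
  data_param {
    source: "train_lmdb"           # placeholder path
    batch_size: 64
    backend: LMDB
  }
}
```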