Wrong shapes of internal layers
Closed this issue · 12 comments
Hi,
I'm trying to reproduce your method, but I'm not able to start training the model.
After running the train.py module, this error occurs:
mxnet.base.MXNetError: MXNetError: Error in operator pre_fc1: Shape inconsistent, Provided = [512,25088], inferred shape=(512,100352)
Basically, it fails on this line:
outputs = [net(X) for X in data]
I believe this could be caused by wrong weights or a wrong model architecture, but I'm not sure where you got the weights for the ArcFace model. I have tried to download them from the official Model-Zoo GitHub page and from the insightface.ai website, but both raise that error.
I'm totally new to MXNet and ArcFace, and would be grateful for your help!
Hello!
I used the model which you can get via insightface.model_zoo.get_model('arcface_r100_v1'). It should be the same as the one from the website.
I believe the problem is with the input image size. I used face detection and alignment as preprocessing and only passed images where faces were detected. Such images are 112x112.
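For intuition, the mismatch in the original error is consistent with feeding 224x224 images to a backbone that expects 112x112. The helper below is a back-of-the-envelope check, assuming the flatten before pre_fc1 sees 512 channels at 1/16 of the input resolution (an assumption read off the error message, not taken from the actual network definition):

```python
# Hypothetical helper: number of features reaching pre_fc1 for a given input
# size, assuming 512 channels at 1/16 spatial resolution before the flatten.
def pre_fc1_input_size(height, width, channels=512, stride=16):
    return channels * (height // stride) * (width // stride)

print(pre_fc1_input_size(112, 112))  # 25088  -- what the checkpoint provides
print(pre_fc1_input_size(224, 224))  # 100352 -- the shape inferred from the data
```

The two numbers match the "Provided = [512,25088], inferred shape=(512,100352)" in the error, which is why resizing/aligning to 112x112 fixes it.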
P.S. Sorry that the repository is a mess, I will try to help with the reproduction as much as I can.
Hi,
Oh yeah, I noticed that you didn't use detection and alignment in the training loop, but I didn't pay much attention to it. I used the prepare_images method from the verification.py module to perform preprocessing, and training seems to be going well now.
I'll let you know if I was able to reproduce the results.
Thank you a lot!
Glad to be of help.
Closing this issue.
Hi,
I'm having a similar issue:
mxnet.base.MXNetError: Error in operator fc_classification: Shape inconsistent, Provided = [571,512], inferred shape=(570,571)
How can I fix it?
I would be grateful for your help.
Be sure that your input tensor size is [batch_size, 3, 112, 112].
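A quick sanity check you could run on a batch before the forward pass (shown with NumPy for brevity; an mxnet.nd.NDArray exposes the same .shape attribute, and the variable name batch is just a placeholder):

```python
import numpy as np

# Stand-in for one data batch; replace with your actual batch tensor.
batch = np.zeros((8, 3, 112, 112), dtype=np.float32)

assert batch.ndim == 4, f"expected an NCHW batch, got {batch.ndim} dims"
assert batch.shape[1:] == (3, 112, 112), f"unexpected image shape: {batch.shape}"
```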
OK. If you load the already trained models (from this repository) instead of the pre-trained model from insightface, does the error still occur?
I used the trained models which I downloaded from the link you gave.
Oh, I get it now. You're trying to feed the full model to the training pipeline. It won't work because of the additional layers.
The training pipeline expects the model to output a vector with 512 features.
You can try removing the last layers from the trained model (the output of the last BatchNorm should be fine) and try again.
Yes, it is working now.
But I'm confused: what is the last layer designed for?
Family classification. You can see that the layer is added inside train.py.
Thank you very much. My English is not good, thank you for your patience.