Wrong shapes of internal layers

Question

Closed this issue 4 years ago · 12 comments

Hi,

I'm trying to reproduce your method, but I'm not able to start training the model.

After running the train.py module, the following error occurs:

mxnet.base.MXNetError: MXNetError: Error in operator pre_fc1: Shape inconsistent, Provided = [512,25088], inferred shape=(512,100352)

Basically, it fails on this line:

outputs = [net(X) for X in data]

I believe this may be caused by the wrong weights or model architecture, but I'm not sure where you took the weights for the ArcFace model from. I have tried downloading them from the official Model-Zoo GitHub page and from the insightface.ai website, but both of them raise this error.

I'm totally new to MXNet and ArcFace, and would be grateful for your help!

Answer 1 · 2021-02-28T00:14:03.000Z

Hello!

I used the model that you can get via insightface.model_zoo.get_model('arcface_r100_v1'). It should be the same as the one from the website.
I believe the problem is with the size of the input images. I used face detection and alignment as preprocessing and only passed on the images in which a face was detected. Such images are 112x112.

P.S. Sorry that the repository is a mess, I will try to help with the reproduction as much as I can.
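
For reference, the two numbers in the error message are consistent with the input-size explanation. Below is a minimal sketch of the arithmetic, assuming the backbone ends with 512 channels and downsamples the image by a total factor of 16 before the pre_fc1 layer. Both values are assumptions inferred from 25088 = 512 * 7 * 7, not taken from the repository, and `flattened_size` is a hypothetical helper.

```python
def flattened_size(height, width, channels=512, stride=16):
    """Number of features that reach the first fully connected layer.

    channels and stride are assumptions (see the note above), chosen so that
    a 112x112 input reproduces the 25088 expected by pre_fc1.
    """
    return channels * (height // stride) * (width // stride)


# Aligned 112x112 face crop: matches the provided weight shape [512, 25088].
print(flattened_size(112, 112))  # 25088

# A 224x224 image is one input size that would give the inferred 100352.
print(flattened_size(224, 224))  # 100352
```

Under these assumptions, an input with four times as many pixels as an aligned 112x112 crop produces exactly the inferred shape, which is why preprocessing to 112x112 should remove the error.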

Answer 2 · 2021-02-28T09:05:15.000Z

Hi,

Oh, yeah, I noticed that you didn't use detection and alignment in the training loop, but I did not pay much attention to it. I used the prepare_images method from the verification.py module to perform the preprocessing, and training seems to be going well now.

I'll let you know whether I am able to reproduce the results.

Thanks a lot!

Answer 3 · 2021-02-28T10:42:54.000Z

Glad to be of help.
Closing this issue.

Answer 4 · 2023-02-16T03:39:32.000Z

Hi,

I'm having a similar issue:

mxnet.base.MXNetError: Error in operator fc_classification: Shape inconsistent, Provided = [571,512], inferred shape=(570,571)

How can I fix it?
I would be grateful for your help.

Answer 5 · 2023-02-16T04:41:07.000Z

Be sure that your input tensor size is [batch_size, 3, 112, 112].
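
A quick way to verify this before the forward pass is sketched below. `check_input_shape` is a hypothetical helper, not part of the repository; it only assumes that the batch exposes its shape as a tuple, as `X.shape` does.

```python
EXPECTED_TAIL = (3, 112, 112)  # channels, height, width


def check_input_shape(shape):
    """Return the batch size if shape is (batch_size, 3, 112, 112), else raise."""
    shape = tuple(shape)
    if len(shape) != 4 or shape[1:] != EXPECTED_TAIL:
        raise ValueError(
            "expected (batch_size, 3, 112, 112), got {}".format(shape)
        )
    return shape[0]


# Usage inside the training loop (X is one batch):
#     check_input_shape(X.shape)
print(check_input_shape((48, 3, 112, 112)))  # 48
```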

Answer 6 · 2023-02-16T06:26:09.000Z

I printed X.shape.
It is 48 x 3 x 112 x 112.

Answer 7 · 2023-02-16T07:08:57.000Z

OK. If you load the already trained models (from this repository) instead of the pre-trained model from insightface, does the error still occur?

Answer 8 · 2023-02-16T07:13:49.000Z

I used the trained models, which I downloaded from the link you gave:

https://disk.yandex.ru/d/4AbxLjTa3fsG7g

Answer 9 · 2023-02-16T09:03:44.000Z

Oh, I get it now. You're trying to provide the full model to the training pipeline. It won't work because of the additional layers.
The training pipeline expects the model to output a vector with 512 features.

You can try to remove the last layers from the trained model (the output of the last BatchNorm should be fine) and try again.
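
To illustrate why the shapes clash, here is a minimal sketch in plain Python. It is one plausible reading of the error message, not the repository's code: it assumes a classification layer whose weight has shape [571, 512], so it expects 512 input features, and a full model that already ends in its own classification layer and therefore outputs 571 values instead of a 512-feature embedding. `dense_output_dim` is a hypothetical helper.

```python
EMBEDDING_DIM = 512  # what the training pipeline expects from the backbone


def dense_output_dim(in_features, weight_shape):
    """Output size of a dense layer with weight (out_features, in_features).

    Raises ValueError when the incoming feature count does not match the weight.
    """
    out_features, expected_in = weight_shape
    if in_features != expected_in:
        raise ValueError(
            "layer expects {} input features, got {}".format(expected_in, in_features)
        )
    return out_features


head_weight = (571, 512)  # shape taken from the error message above

# Backbone only: a 512-feature embedding fits the classification layer.
print(dense_output_dim(EMBEDDING_DIM, head_weight))  # 571

# Full trained model (assumed to output 571 values): the same layer rejects it.
try:
    dense_output_dim(571, head_weight)
except ValueError as err:
    print("shape mismatch:", err)
```

Cutting the trained model back to its 512-feature output restores the first case.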

Answer 10 · 2023-02-17T13:18:14.000Z

Yes, it is working now.
But I am confused: what is the last layer designed for?

Answer 11 · 2023-02-17T17:45:24.000Z

Family classification. You can see that this layer is added inside train.py.

Answer 12 · 2023-02-20T07:10:48.000Z

Thank you very much. My English is not good. Thank you for your patience.