VITA-Group/SLaK

Load weights

qdd1234 opened this issue · 9 comments

Hi, when I use torch.load(xxxx.pth) and model.load_state_dict(xx), I get the following warning:
Weights of SLaK not initialized from pretrained model: ['stages.0.0.large_kernel.Decom1.conv.weight', 'stages.0.0.large_kernel.Decom1.bn.weight', 'stages.0.0.large_kernel.Decom1.bn.bias', 'stages.0.0.large_kernel.Decom1.bn.running_mean']

Weights from pretrained model not used in SLaK: ['stages.0.0.large_kernel.lkb_origin.conv.weight', 'stages.0.0.large_kernel.lkb_origin.bn.weight', 'stages.0.0.large_kernel.lkb_origin.bn.bias', 'stages.0.0.large_kernel.lkb_origin.bn.running_mean']

Do you apply the re-parameterization operation when saving the model? Thanks!

Hi, to run SLaK models, we set --Decom True so that two decomposed kernels are used instead of one big M×M lkb_origin kernel. We do not use the re-parameterization operation for inference.
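Roughly, the decomposed path looks like the sketch below (illustrative only, not our exact implementation; the module name DecomposedLargeKernel, the kernel sizes M=51/N=5, and the channel count are placeholders):

import torch
import torch.nn as nn

class ConvBN(nn.Module):
    # Depthwise conv followed by BatchNorm, mirroring the conv/bn key names above
    def __init__(self, channels, kernel_size):
        super().__init__()
        pad = (kernel_size[0] // 2, kernel_size[1] // 2)
        self.conv = nn.Conv2d(channels, channels, kernel_size,
                              padding=pad, groups=channels, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        return self.bn(self.conv(x))

class DecomposedLargeKernel(nn.Module):
    # Two rectangular depthwise kernels (Decom1, Decom2) replacing one big M x M kernel
    def __init__(self, channels, M=51, N=5):
        super().__init__()
        self.Decom1 = ConvBN(channels, (M, N))  # M x N branch
        self.Decom2 = ConvBN(channels, (N, M))  # N x M branch

    def forward(self, x):
        # Sum the two branches; padding keeps the spatial size unchanged
        return self.Decom1(x) + self.Decom2(x)

x = torch.randn(1, 96, 56, 56)
y = DecomposedLargeKernel(96)(x)  # same shape as x

This produces state_dict keys of the form ...large_kernel.Decom1.conv.weight, matching the warning above.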

But the keys in your pretrained weights correspond to the big M×M lkb_origin kernel instead of the two decomposed kernels. What is the reason?

There might be some mistake. Could you please tell us which pretrained model you are testing and provide more information about this?

We use SLaK-T. The code we use to load the model is as follows:

import torch
from slak_tr import SLaK_tiny as tmm

# Build the model with decomposed kernels and the SLaK-T width factor
torch_model = tmm(Decom=True, width_factor=1.3)

# Load the checkpoint on CPU and restore the weights
state = torch.load('SLaK_tiny_checkpoint.pth', map_location=torch.device('cpu'))
torch_model.load_state_dict(state["model"])

The slak_tr.py is the same as SLaK.py.
When running the code above, we get the following error:
Missing key(s) in state_dict: "stages.0.0.large_kernel.Decom1.conv.weight", "stages.0.0.large_kernel.Decom1.bn.weight", "stages.0.0.large_kernel.Decom1.bn.bias", "stages.0.0.large_kernel.Decom1.bn.running_mean", "stages.0.0.large_kernel.Decom1.bn.running_var", "stages.0.0.large_kernel.Decom2.conv.weight", "stages.0.0.large_kernel.Decom2.bn.weight", "stages.0.0.large_kernel.Decom2.bn.bias", "stages.0.0.large_kernel.Decom2.bn.running_mean"..........]
Unexpected key(s) in state_dict: "stages.0.0.large_kernel.lkb_origin.conv.weight", "stages.0.0.large_kernel.lkb_origin.bn.weight", "stages.0.0.large_kernel.lkb_origin.bn.bias", "stages.0.0.large_kernel.lkb_origin.bn.running_mean", "stages.0.0.large_kernel.lkb_origin.bn.running_var", "stages.0.0.large_kernel.lkb_origin.bn.num_batches_tracked", "stages.0.1.large_kernel.lkb_origin.conv.weight", "stages.0.1.large_kernel.lkb_origin.bn.weight", "stages.0.1.large_kernel.lkb_origin.bn.bias", "stages.0.1.large_kernel.lkb_origin.bn.running_mean", "stages.0.1.large_kernel.lkb_origin.bn.running_var", "stages.0.1.large_kernel.lkb_origin.bn.num_batches_tracked", "stages.0.2.large_kernel.lkb_origin.conv.weight", "stages.0.2.large_kernel.lkb_origin.bn.weight", "stages.0.2.large_kernel.lkb_origin.bn.bias", "stages.0.2.large_kernel.lkb_origin.bn.running_mean", "stages.0.2.large_kernel.lkb_origin.bn.running_var", "stages.0.2.large_kernel.lkb_origin.bn.num_batches_tracked", "stages.1.0.large_kernel.lkb_origin.conv.weight"........]

Could you help me solve this problem? Thanks!
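For reference, a quick way to compare the two key sets directly (a minimal diagnostic sketch, reusing the model and checkpoint from above):

import torch
from slak_tr import SLaK_tiny as tmm

torch_model = tmm(Decom=True, width_factor=1.3)
state = torch.load('SLaK_tiny_checkpoint.pth', map_location='cpu')

model_keys = set(torch_model.state_dict().keys())
ckpt_keys = set(state["model"].keys())

# Keys the model expects but the checkpoint lacks, and vice versa
print("missing:", sorted(model_keys - ckpt_keys)[:5])
print("unexpected:", sorted(ckpt_keys - model_keys)[:5])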

Thanks for your follow-up. I have indeed reproduced your error; I believe we pasted the wrong link. I have fixed the link just now, and the model will give you 82.482 top-1 accuracy. Could you please try again?

Sorry, I still fail to load the pretrained weights. I get the following error:
Missing key(s) in state_dict: "stages.0.0.large_kernel.Decom1.conv.weight", "stages.0.0.large_kernel.Decom1.bn.weight", "stages.0.0.large_kernel.Decom1.bn.bias", "stages.0.0.large_kernel.Decom1.bn.running_mean", "stages.0.0.large_kernel.Decom1.bn.running_var", "stages.0.0.large_kernel.Decom2.conv.weight", "stages.0.0.large_kernel.Decom2.bn.weight", "stages.0.0.large_kernel.Decom2.bn.bias", "stages.0.0.large_kernel.Decom2.bn.running_mean", "stages.0.0.large_kernel.Decom2.bn.running_var", "stages.0.1.large_kernel.Decom1.conv.weight", "stages.0.1.large_kernel.Decom1.bn.weight", "stages.0.1.large_kernel.Decom1.bn.bias"......]

Unexpected key(s) in state_dict: "stages.0.0.large_kernel.LoRA1.conv.weight", "stages.0.0.large_kernel.LoRA1.bn.weight", "stages.0.0.large_kernel.LoRA1.bn.bias", "stages.0.0.large_kernel.LoRA1.bn.running_mean", "stages.0.0.large_kernel.LoRA1.bn.running_var", "stages.0.0.large_kernel.LoRA1.bn.num_batches_tracked", "stages.0.0.large_kernel.LoRA2.conv.weight", "stages.0.0.large_kernel.LoRA2.bn.weight", "stages.0.0.large_kernel.LoRA2.bn.bias", "stages.0.0.large_kernel.LoRA2.bn.running_mean", "stages.0.0.large_kernel.LoRA2.bn.running_var", "stages.0.0.large_kernel.LoRA2.bn.num_batches_tracked", "stages.0.1.large_kernel.LoRA1.conv.weight", "stages.0.1.large_kernel.LoRA1.bn.weight", "stages.0.1.large_kernel.LoRA1.bn.bias", "stages.0.1.large_kernel.LoRA1.bn.running_mean", "stages.0.1.large_kernel.LoRA1.bn.running_var"....]

I know this error may be caused by the way I load the pretrained weights. I use the following code to load them, and I want to ask whether this weight-loading approach is reasonable:

import torch
from slak_tr import SLaK_tiny as tmm

# Build the model with decomposed kernels and the SLaK-T width factor
torch_model = tmm(Decom=True, width_factor=1.3)

# Load the checkpoint on CPU and restore the weights
state = torch.load('SLaK_tiny_checkpoint.pth', map_location=torch.device('cpu'))
torch_model.load_state_dict(state["model"])
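As a possible workaround (assuming the LoRA1/LoRA2 keys in the checkpoint correspond to the model's Decom1/Decom2 branches, which is only my guess from the error message), the checkpoint keys could be renamed before loading:

# Rename LoRA1/LoRA2 checkpoint keys to the Decom1/Decom2 names the model expects
# (the correspondence is assumed, not confirmed by the repository)
remapped = {
    k.replace(".LoRA1.", ".Decom1.").replace(".LoRA2.", ".Decom2."): v
    for k, v in state["model"].items()
}
torch_model.load_state_dict(remapped)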

I fixed this argument inconsistency earlier today. Can you git pull or clone again?

OK, thanks for your quick reply. I will try again.

Hi, I have successfully reproduced the SLaK-T model with the PaddlePaddle framework (acc_top1: 0.8240, acc_top5: 0.9677). Thanks for your wonderful work!