VITA-Group/SLaK

Load weights

qdd1234 opened this issue · 9 comments

Hi, when I use torch.load(xxxx.pth) and model.load_state_dict(xx), I get the following warning:
Weights of SLaK not initialized from pretrained model: ['stages.0.0.large_kernel.Decom1.conv.weight', 'stages.0.0.large_kernel.Decom1.bn.weight', 'stages.0.0.large_kernel.Decom1.bn.bias', 'stages.0.0.large_kernel.Decom1.bn.running_mean']

Weights from pretrained model not used in SLaK: ['stages.0.0.large_kernel.lkb_origin.conv.weight', 'stages.0.0.large_kernel.lkb_origin.bn.weight', 'stages.0.0.large_kernel.lkb_origin.bn.bias', 'stages.0.0.large_kernel.lkb_origin.bn.running_mean']

Do you apply the re-parameterization operation when saving the model? Thanks!

Hi, to run SLaK models, we set --Decom True so that two decomposed kernels are used instead of one big M×M lkb_origin kernel. We do not use the re-parameterization operation for inference.
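Roughly, the decomposed path looks like the sketch below (illustrative only, not our exact implementation; the module name DecomposedLargeKernel, the kernel sizes M=51/N=5, and the channel count are placeholders):

import torch
import torch.nn as nn

class ConvBN(nn.Module):
    # Depthwise conv followed by BatchNorm, mirroring the conv/bn key names above
    def __init__(self, channels, kernel_size):
        super().__init__()
        pad = (kernel_size[0] // 2, kernel_size[1] // 2)
        self.conv = nn.Conv2d(channels, channels, kernel_size,
                              padding=pad, groups=channels, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        return self.bn(self.conv(x))

class DecomposedLargeKernel(nn.Module):
    # Two rectangular depthwise kernels (Decom1, Decom2) replacing one big M x M kernel
    def __init__(self, channels, M=51, N=5):
        super().__init__()
        self.Decom1 = ConvBN(channels, (M, N))  # M x N branch
        self.Decom2 = ConvBN(channels, (N, M))  # N x M branch

    def forward(self, x):
        # Sum the two branches; padding keeps the spatial size unchanged
        return self.Decom1(x) + self.Decom2(x)

x = torch.randn(1, 96, 56, 56)
y = DecomposedLargeKernel(96)(x)  # same shape as x

This produces state_dict keys of the form ...large_kernel.Decom1.conv.weight, matching the warning above.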

But the keys in your pretrained weights correspond to the big M×M lkb_origin kernel instead of the two decomposed kernels. What is the reason?

There might be some mistake. Could you please tell us which pretrained model you are testing and provide more information about this?

We use SLaK-T. The code we use to load the model is as follows:

import torch
from slak_tr import SLaK_tiny as tmm

# Build the model with decomposed kernels and the SLaK-T width factor
torch_model = tmm(Decom=True, width_factor=1.3)

# Load the checkpoint on CPU and restore the weights
state = torch.load('SLaK_tiny_checkpoint.pth', map_location=torch.device('cpu'))
torch_model.load_state_dict(state["model"])

The slak_tr.py is the same as SLaK.py.
When running the code above, we get the following error:
Missing key(s) in state_dict: "stages.0.0.large_kernel.Decom1.conv.weight", "stages.0.0.large_kernel.Decom1.bn.weight", "stages.0.0.large_kernel.Decom1.bn.bias", "stages.0.0.large_kernel.Decom1.bn.running_mean", "stages.0.0.large_kernel.Decom1.bn.running_var", "stages.0.0.large_kernel.Decom2.conv.weight", "stages.0.0.large_kernel.Decom2.bn.weight", "stages.0.0.large_kernel.Decom2.bn.bias", "stages.0.0.large_kernel.Decom2.bn.running_mean"..........]
Unexpected key(s) in state_dict: "stages.0.0.large_kernel.lkb_origin.conv.weight", "stages.0.0.large_kernel.lkb_origin.bn.weight", "stages.0.0.large_kernel.lkb_origin.bn.bias", "stages.0.0.large_kernel.lkb_origin.bn.running_mean", "stages.0.0.large_kernel.lkb_origin.bn.running_var", "stages.0.0.large_kernel.lkb_origin.bn.num_batches_tracked", "stages.0.1.large_kernel.lkb_origin.conv.weight", "stages.0.1.large_kernel.lkb_origin.bn.weight", "stages.0.1.large_kernel.lkb_origin.bn.bias", "stages.0.1.large_kernel.lkb_origin.bn.running_mean", "stages.0.1.large_kernel.lkb_origin.bn.running_var", "stages.0.1.large_kernel.lkb_origin.bn.num_batches_tracked", "stages.0.2.large_kernel.lkb_origin.conv.weight", "stages.0.2.large_kernel.lkb_origin.bn.weight", "stages.0.2.large_kernel.lkb_origin.bn.bias", "stages.0.2.large_kernel.lkb_origin.bn.running_mean", "stages.0.2.large_kernel.lkb_origin.bn.running_var", "stages.0.2.large_kernel.lkb_origin.bn.num_batches_tracked", "stages.1.0.large_kernel.lkb_origin.conv.weight"........]

Could you help me solve this problem? Thanks!
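For reference, a quick way to compare the two key sets directly (a minimal diagnostic sketch, reusing the model and checkpoint from above):

import torch
from slak_tr import SLaK_tiny as tmm

torch_model = tmm(Decom=True, width_factor=1.3)
state = torch.load('SLaK_tiny_checkpoint.pth', map_location='cpu')

model_keys = set(torch_model.state_dict().keys())
ckpt_keys = set(state["model"].keys())

# Keys the model expects but the checkpoint lacks, and vice versa
print("missing:", sorted(model_keys - ckpt_keys)[:5])
print("unexpected:", sorted(ckpt_keys - model_keys)[:5])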

Thanks for your follow-up. I have indeed reproduced your error; I believe we pasted the wrong link. I have fixed the link just now, and the model will give you 82.482 top-1 accuracy. Could you please try again?

Sorry, I still fail to load the pretrained weights. I get the following error:
Missing key(s) in state_dict: "stages.0.0.large_kernel.Decom1.conv.weight", "stages.0.0.large_kernel.Decom1.bn.weight", "stages.0.0.large_kernel.Decom1.bn.bias", "stages.0.0.large_kernel.Decom1.bn.running_mean", "stages.0.0.large_kernel.Decom1.bn.running_var", "stages.0.0.large_kernel.Decom2.conv.weight", "stages.0.0.large_kernel.Decom2.bn.weight", "stages.0.0.large_kernel.Decom2.bn.bias", "stages.0.0.large_kernel.Decom2.bn.running_mean", "stages.0.0.large_kernel.Decom2.bn.running_var", "stages.0.1.large_kernel.Decom1.conv.weight", "stages.0.1.large_kernel.Decom1.bn.weight", "stages.0.1.large_kernel.Decom1.bn.bias"......]

Unexpected key(s) in state_dict: "stages.0.0.large_kernel.LoRA1.conv.weight", "stages.0.0.large_kernel.LoRA1.bn.weight", "stages.0.0.large_kernel.LoRA1.bn.bias", "stages.0.0.large_kernel.LoRA1.bn.running_mean", "stages.0.0.large_kernel.LoRA1.bn.running_var", "stages.0.0.large_kernel.LoRA1.bn.num_batches_tracked", "stages.0.0.large_kernel.LoRA2.conv.weight", "stages.0.0.large_kernel.LoRA2.bn.weight", "stages.0.0.large_kernel.LoRA2.bn.bias", "stages.0.0.large_kernel.LoRA2.bn.running_mean", "stages.0.0.large_kernel.LoRA2.bn.running_var", "stages.0.0.large_kernel.LoRA2.bn.num_batches_tracked", "stages.0.1.large_kernel.LoRA1.conv.weight", "stages.0.1.large_kernel.LoRA1.bn.weight", "stages.0.1.large_kernel.LoRA1.bn.bias", "stages.0.1.large_kernel.LoRA1.bn.running_mean", "stages.0.1.large_kernel.LoRA1.bn.running_var"....]

I know this error may be caused by the way I load the pretrained weights. I use the following code to load them, and I want to ask whether this weight-loading approach is reasonable:

import torch
from slak_tr import SLaK_tiny as tmm

# Build the model with decomposed kernels and the SLaK-T width factor
torch_model = tmm(Decom=True, width_factor=1.3)

# Load the checkpoint on CPU and restore the weights
state = torch.load('SLaK_tiny_checkpoint.pth', map_location=torch.device('cpu'))
torch_model.load_state_dict(state["model"])
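As a possible workaround (assuming the LoRA1/LoRA2 keys in the checkpoint correspond to the model's Decom1/Decom2 branches, which is only my guess from the error message), the checkpoint keys could be renamed before loading:

# Rename LoRA1/LoRA2 checkpoint keys to the Decom1/Decom2 names the model expects
# (the correspondence is assumed, not confirmed by the repository)
remapped = {
    k.replace(".LoRA1.", ".Decom1.").replace(".LoRA2.", ".Decom2."): v
    for k, v in state["model"].items()
}
torch_model.load_state_dict(remapped)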

I fixed this argument inconsistency earlier today. Can you git pull or clone again?

OK, thanks for your quick reply. I will try again.

Hi, I have successfully reproduced the SLaK-T model with the PaddlePaddle framework (acc_top1: 0.8240, acc_top5: 0.9677). Thanks for your wonderful work!