Loading pre-trained model
Maha59 opened this issue · 4 comments
Hello, thank you for sharing your work.
I'm attempting to load a pre-trained model using the following code:
from fastervit import create_model
model = create_model('faster_vit_0_224', pretrained = '/checkpoint_path/fastervit_0_224_1k.pth.tar')
However, I'm encountering an error message which suggests that the checkpoint I'm trying to load is incompatible with the model architecture: 'Unexpected key(s) in state_dict: "epoch", "arch", "state_dict", "optimizer", "version", "args", "amp_scaler", "metric"'. This typically occurs when the model architecture does not match that of the pretrained weights, for example due to differences in the number or types of layers, layer parameters, layer naming, or the number of output classes. Could you please assist me in resolving this issue?
Kind regards,
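Note: those unexpected keys ("epoch", "arch", "state_dict", "optimizer", ...) look like the wrapper of a full training checkpoint rather than a bare weights file, so the mismatch is at the container level, not the architecture level. A minimal sketch of a workaround, assuming the .pth.tar is such a checkpoint with the actual weights stored under its "state_dict" key (the repo README also appears to pass the file via model_path with pretrained=True, if I read it correctly):

```python
import torch
from fastervit import create_model

# Build the model without loading weights, then unwrap the checkpoint
# manually. The checkpoint path is the one from the report above.
model = create_model('faster_vit_0_224')

checkpoint = torch.load('/checkpoint_path/fastervit_0_224_1k.pth.tar',
                        map_location='cpu')

# Training checkpoints keep the weights under "state_dict"; a bare
# weights file already is a state_dict, so fall back to the dict itself.
state_dict = checkpoint.get('state_dict', checkpoint)
model.load_state_dict(state_dict)
```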
Hi, I ran into the same problem. My input size is [256, 704], and I want to use the pretrained fastervit_2_224_1k.pth.tar with the model name "faster_vit_2_any_res". The error is:
RuntimeError: Error(s) in loading state_dict for FasterViT:
Missing key(s) in state_dict: "levels.3.blocks.0.hat_norm1.weight", "levels.3.blocks.0.hat_norm1.bias", "levels.3.blocks.0.hat_norm2.weight", "levels.3.blocks.0.hat_norm2.bias", "levels.3.blocks.0.hat_attn.qkv.weight", "levels.3.blocks.0.hat_attn.qkv.bias", "levels.3.blocks.0.hat_attn.proj.weight".
Thanks a lot.
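If it helps, the missing hat_* keys belong to the hierarchical-attention (HAT) blocks that the any-resolution variant adds, so a 224x224 checkpoint simply does not contain them. A hedged sketch, assuming the resolution keyword from the repo's any-res example: load non-strictly so the HAT layers keep their random initialization (they would then need fine-tuning). Be aware that strict=False only tolerates missing or unexpected keys; tensors whose shapes differ still raise an error, which the next comment touches on.

```python
import torch
from fastervit import create_model

# Sketch: build the any-resolution model at the target input size, then
# load the 224 x 224 checkpoint non-strictly. Keys absent from the
# checkpoint (the hat_* layers) stay randomly initialized.
model = create_model('faster_vit_2_any_res', resolution=[256, 704])

checkpoint = torch.load('fastervit_2_224_1k.pth.tar', map_location='cpu')
state_dict = checkpoint.get('state_dict', checkpoint)

missing, unexpected = model.load_state_dict(state_dict, strict=False)
print('missing keys:', missing)
print('unexpected keys:', unexpected)
```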
Hi, do different input sizes require training different models? Otherwise, the shapes of "levels.2.blocks.0.attn.pos_emb_funct.relative_bias" and "levels.2.blocks.0.hat_attn.pos_emb_funct.relative_coords_table" may be different, right?
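As far as I understand, a different input size does not force retraining from scratch, but resolution-dependent tensors such as the relative position tables mentioned above will not match the checkpoint shapes. A hedged sketch of one common recipe (an assumption on my part, not an official answer): drop the checkpoint entries whose shapes disagree with the freshly built model, load the rest non-strictly, and fine-tune so the skipped tensors are re-learned.

```python
import torch
from fastervit import create_model

# Hypothetical recipe: keep only checkpoint entries whose name and shape
# both match the model built at the new resolution; shape-dependent
# tensors (e.g. relative_bias / relative_coords_table) are dropped and
# re-learned during fine-tuning.
model = create_model('faster_vit_2_any_res', resolution=[256, 704])

checkpoint = torch.load('fastervit_2_224_1k.pth.tar', map_location='cpu')
state_dict = checkpoint.get('state_dict', checkpoint)

model_sd = model.state_dict()
filtered = {k: v for k, v in state_dict.items()
            if k in model_sd and v.shape == model_sd[k].shape}
model.load_state_dict(filtered, strict=False)
```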