Error while Loading vit_small weights

Question

shubhaminnani opened this issue 2 years ago · 6 comments

Hi @Xiyue-Wang ,
Thank you for the amazing repo.
I am trying to load the pretrained MoCoV3 vit_small weights.

model = moco.builder_infence.MoCo_ViT(partial(vits.__dict__['vit_small'], stop_grad_conv1=True))
pretext_model = torch.load('TransPath/vit_small.pth.tar')['state_dict']
model = nn.DataParallel(model).cuda()
model.load_state_dict(pretext_model,strict=False)

But I am getting the error below.

I checked the keys of the model and of the weights, and they seem to be the same, but I still get the above error.

Looking forward to your reply.
Thanks!
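
A quick way to check whether two sets of keys really are the same is to diff them rather than compare them by eye. The following is a minimal sketch, not code from the repo: compare_keys is a hypothetical helper, and the key names in the toy example are made up for illustration.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists for a model/checkpoint pair."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    # Keys the model expects but the checkpoint does not provide.
    missing = sorted(model_keys - checkpoint_keys)
    # Keys the checkpoint provides but the model does not have.
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Toy example with made-up key names (not taken from the real checkpoint).
model_keys = ["module.base_encoder.cls_token", "module.base_encoder.head.weight"]
checkpoint_keys = ["module.base_encoder.cls_token", "module.predictor.0.weight"]
missing, unexpected = compare_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

With the real objects, the inputs would be model.state_dict().keys() and pretext_model.keys(). Note also that load_state_dict(..., strict=False) returns an object with missing_keys and unexpected_keys fields, and that strict=False only tolerates differing key names; a tensor whose shape differs still raises an error.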

Answer 1 · 2023-06-01T02:45:12.000Z

You can try:
model = moco.builder_infence.MoCo_ViT(
    partial(vits.dict[args.arch], stop_grad_conv1=True))

pretext_model = torch.load(r'./vit_small.pth.tar')['state_dict']
model = nn.DataParallel(model).cuda()
model.load_state_dict(pretext_model, strict=True)
Many people use this without any problem, so I do not know why you are getting an error.

Answer 2 · 2023-06-01T14:25:25.000Z

Thank you, but dict cannot be accessed the way you wrote it above.

I tried it anyway, and it gave the error below:
AttributeError: module 'vits' has no attribute 'dict'

Thanks

Answer 3 · 2023-06-01T14:36:17.000Z

Maybe remove model = nn.DataParallel(model).cuda()?

Answer 4 · 2023-06-01T15:26:25.000Z

I tried removing that line. Now all of the keys mismatch, and none of them can be loaded.
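
One common cause of a complete mismatch after dropping nn.DataParallel is that the wrapper stores the network under .module, so every key saved from a wrapped model starts with "module.". Whether that applies to vit_small.pth.tar is an assumption here. If it does, a sketch like the following could rename the keys before loading; strip_prefix is a hypothetical helper and the key names are invented.

```python
def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with `prefix` removed from keys that start with it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Toy stand-in for a checkpoint; in practice the values would be tensors.
checkpoint = {
    "module.base_encoder.cls_token": 0,
    "module.base_encoder.pos_embed": 1,
}
plain = strip_prefix(checkpoint)
print(list(plain))
```

The renamed dictionary would then be passed to load_state_dict on the unwrapped model. Keys without the prefix are left untouched, so the helper is safe to apply to a checkpoint that was saved without the wrapper.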

Answer 5 · 2023-06-01T15:36:06.000Z

Do I need code similar to model.module.online_encoder.net.head = nn.Identity(), as used in TransPath, to extract the features?

Answer 6 · 2023-06-01T20:55:13.000Z

Hi @Xiyue-Wang ,
I think I found the problem. The model was trained with moco.builder, but for inference you are calling the module from moco.builder_infence, where the code below is commented out in the forward function. The trained weights have keys from the former module, and they don't match the keys in moco.builder_infence.

As I understand the code, feature extraction should use only the base encoder rather than the complete architecture, so a solution is needed for that. It may be better to load the complete weights and truncate the model afterwards. What's your view?

Thanks!
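
An alternative to truncating the model after loading is to filter the checkpoint so that only the base encoder's entries remain, and load those into a bare ViT. The sketch below assumes the checkpoint keys start with "module.base_encoder." and that the projection head lives under "head."; both are assumptions that should be checked against the actual keys. select_submodule is a hypothetical helper.

```python
def select_submodule(state_dict, prefix, drop=()):
    """Keep entries under `prefix`, strip the prefix, and skip sub-keys
    that start with any of the names in `drop`."""
    selected = {}
    for key, value in state_dict.items():
        if not key.startswith(prefix):
            continue  # belongs to another part of the model
        short = key[len(prefix):]
        if any(short.startswith(name) for name in drop):
            continue  # e.g. the projection head, not needed for features
        selected[short] = value
    return selected


# Toy stand-in for a checkpoint with made-up keys; values would be tensors.
checkpoint = {
    "module.base_encoder.cls_token": 0,
    "module.base_encoder.head.0.weight": 1,
    "module.momentum_encoder.cls_token": 2,
    "module.predictor.0.weight": 3,
}
backbone = select_submodule(checkpoint, "module.base_encoder.", drop=("head.",))
print(backbone)
```

The filtered dictionary could then be loaded with strict=False into a plain vit_small whose head has been replaced by nn.Identity(), so that the forward pass returns features directly.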