facebookresearch/AVT

Loading pretrained ViT base as a backbone

ofir1080 opened this issue · 3 comments

Hi there!
I was wondering what the difference is between the ViT checkpoint file you supplied and simply loading timm's vit_base_patch16_224 via timm.create_model with the pretrained=True flag. Both were pretrained on IN1k, as far as I can tell. Am I correct?
Thanks!

Hi,
Which ViT files are you referring to? For initialization I download the TIMM checkpoints anyway (https://github.com/facebookresearch/AVT#training-and-evaluating-models), which is equivalent to timm.create_model with pretrained=True. The final AVT model is trained end to end, so the final ViT weights will differ from the TIMM initialization.
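For concreteness, here is a minimal sketch (not from the AVT codebase) of the two initialization paths. The checkpoint path is hypothetical, and the file is assumed to hold a plain PyTorch state_dict:

```python
import timm
import torch

# Path 1: let timm fetch the IN1k-pretrained weights itself.
model_a = timm.create_model('vit_base_patch16_224', pretrained=True)

# Path 2: start from a randomly initialized model and load the same
# weights from a locally downloaded checkpoint. The path below is
# hypothetical; the file is assumed to hold a plain state_dict.
model_b = timm.create_model('vit_base_patch16_224', pretrained=False)
state_dict = torch.load('checkpoints/vit_base_patch16_224.pth', map_location='cpu')
model_b.load_state_dict(state_dict)

# If the checkpoint really is the TIMM IN1k release, the two models
# should match parameter-for-parameter at initialization.
for (name, p_a), (_, p_b) in zip(model_a.named_parameters(), model_b.named_parameters()):
    assert torch.allclose(p_a, p_b), f'mismatch in {name}'
```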

Thanks for the comment!
I was just wondering what the difference was between using the pretrained ViT .pyth files and setting pretrained=True.
If I understand correctly, these are exactly the same. Is that right?

Thanks!

Yes, they are. I just used the other way of initializing because it is a bit more convenient to switch to other pretrained models, etc.
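To illustrate the convenience point, a small sketch with hypothetical checkpoint paths: swapping the backbone initialization becomes a path change rather than a code change, and classifier heads with mismatched shapes (e.g. from a checkpoint trained with a different number of classes) can be filtered out before loading:

```python
import timm
import torch

# Hypothetical local checkpoint paths; switching the pretrained
# initialization is just a matter of pointing at a different file.
BACKBONE_CKPTS = {
    'in1k': 'checkpoints/vit_base_patch16_224_in1k.pth',
    'in21k': 'checkpoints/vit_base_patch16_224_in21k.pth',
}

def build_backbone(ckpt_key='in1k'):
    model = timm.create_model('vit_base_patch16_224', pretrained=False)
    state_dict = torch.load(BACKBONE_CKPTS[ckpt_key], map_location='cpu')
    # Keep only tensors whose names and shapes match the model, so a
    # checkpoint with a differently sized classifier head still loads.
    model_sd = model.state_dict()
    state_dict = {k: v for k, v in state_dict.items()
                  if k in model_sd and v.shape == model_sd[k].shape}
    model.load_state_dict(state_dict, strict=False)
    return model

backbone = build_backbone('in1k')
```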