Some questions about ViTKD

Question

peiyingxin opened this issue 2 years ago · 1 comments

Hi, thanks for sharing your great work!
I have some questions about your work:

1. Where did you get your deit3-base model? The official model reaches 85.7 top-1 accuracy on ImageNet-1K, while the deit3-base model in the paper is reported at 85.48. In addition, the official model's state_dict does not match the state_dict of the deit3 model you defined, so did you modify it?
2. I used the vit-base model from mmcls (85.43 top-1 accuracy) as the teacher to distill deit-small from scratch, but only got 80.04 top-1 accuracy, which is below the baseline of 80.69. The deit3-base model structure is the same as vit-base, so I'm confused about why I got this result.

Hoping for your reply.
Thank you.

Answer 1 · 2022-10-20T09:40:41.000Z

1. The weights are converted from the official weights. Because of differences in the environment, the accuracy may differ slightly.
2. ViT-Base is trained at 384x384, while DeiT-S is trained at 224x224. Please be more careful.
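
To illustrate point 2, here is a minimal sketch of why the two input resolutions do not line up for distillation. It assumes both models use 16x16 patches, which is not stated in this thread, and the helper name `num_patch_tokens` is hypothetical, not part of mmcls or ViTKD.

```python
def num_patch_tokens(image_size: int, patch_size: int = 16) -> int:
    """Number of patch tokens a ViT produces for a square input.

    The class token is not counted. patch_size=16 is an assumption;
    adjust it to match the actual model configs.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side * side


# Teacher (ViT-Base) resolution and student (DeiT-S) resolution from the answer
teacher_tokens = num_patch_tokens(384)  # 24 patches per side
student_tokens = num_patch_tokens(224)  # 14 patches per side

# If the token counts differ, per-token feature maps cannot be compared directly
shapes_match = teacher_tokens == student_tokens
print(teacher_tokens, student_tokens, shapes_match)
```

Under that assumption the teacher sees 576 patch tokens and the student 196, so a teacher trained at 384x384 is not being used at the resolution it was trained for when paired with a 224x224 student.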