[BUG] CoAtNet_0 Model different from paper
karam-nus opened this issue · 1 comment
The CoAtNet_0 model defined in the paper has 5 repeating RelTransformer blocks in stage S3, whereas the timm implementation has 7.
We also see a difference in top-1 for this model:
- Top-1 in paper: 81.2
- Top-1 reported on the HF model card: 82.39
- Actual top-1 on IN-1k: 78.87
Steps to reproduce the behavior:
- Get the model from HF/timm:
```python
pt_model = timm.create_model('coatnet_0_rw_224.sw_in1k', pretrained=True)
```
- Validate on the ImageNet-1k validation set.
The model accuracy reported in the HF documentation should be 78.87, not 82.39.
@karam-nus I'm well aware; the models have `rw` in the name because they're my spin on the models. There are many comments and pointers in the code, and this isn't the only difference.
pytorch-image-models/timm/models/maxxvit.py, line 1514 in 7160af4
pytorch-image-models/timm/models/maxxvit.py, lines 1340 to 1389 in 7160af4
There are more paper-like model definitions, but I never trained any:
pytorch-image-models/timm/models/maxxvit.py, lines 1648 to 1653 in 7160af4
If you aren't within +/- 0.1-0.2 of the official eval results (https://github.com/huggingface/pytorch-image-models/blob/main/results/results-imagenet.csv), your eval is wrong.
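A quick way to apply that tolerance check is to look the model up in the published CSV and compare deltas. A minimal sketch, assuming the CSV has `model` and `top1` columns (which matches the current layout of `results-imagenet.csv`, though that is an assumption about a file that may change):

```python
# Sketch: compare a locally measured top-1 against timm's published
# results within the maintainer's +/- 0.1-0.2 tolerance.
import csv
import io


def official_top1(csv_text: str, model_name: str) -> float:
    """Return the published top-1 for model_name from the results CSV text.

    Assumes 'model' and 'top1' column headers, per results-imagenet.csv.
    """
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["model"] == model_name:
            return float(row["top1"])
    raise KeyError(f"{model_name} not found in results CSV")


def within_tolerance(measured: float, official: float, tol: float = 0.2) -> bool:
    """True if the local eval is within +/- tol of the official number."""
    return abs(measured - official) <= tol
```

With the CSV text fetched from the URL above, `within_tolerance(78.87, official_top1(text, "coatnet_0_rw_224.sw_in1k"))` returning False would indicate the local eval setup (transforms, crop pct, interpolation) diverges from the official one, not that the model card is wrong.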