szagoruyko/attention-transfer

My Imagenet replication results are poor


Hello, first of all, thank you very much for your great work!

While trying to reproduce the results of your paper, I ran into several confusing problems in the ImageNet part, and my results are much worse than those reported in the paper.

  • First, the accuracy of the resnet34 teacher network reported in the paper differs from that of the pretrained resnet34 model you provide. I don't know whether this is what leads to the poor student results. Could you provide the resnet34 model used in the paper?

  • Second, for the ImageNet experiment the paper says the hyperparameters are the same as those used in the transfer learning experiments, but no specific values are given. What is the specific beta value, please?
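For context on why the beta value matters, here is a minimal, framework-free sketch of how beta would scale the attention transfer term. It assumes the activation-based formulation (attention map = channel-wise mean of squared activations, L2-normalized) and a total loss of the form cross-entropy + beta/2 × the sum of per-group attention losses. The function names and the list-based activation layout are hypothetical, chosen for illustration, and are not taken from this repository's code.

```python
import math

def attention_map(activation):
    """activation: list of C channels, each a flat list of H*W values.
    Returns the L2-normalized channel-wise mean of squared activations."""
    n_channels = len(activation)
    n_positions = len(activation[0])
    amap = [sum(ch[i] ** 2 for ch in activation) / n_channels
            for i in range(n_positions)]
    norm = math.sqrt(sum(v * v for v in amap)) or 1.0  # guard against all-zero maps
    return [v / norm for v in amap]

def at_loss(student_act, teacher_act):
    """Mean squared difference between the normalized attention maps."""
    s = attention_map(student_act)
    t = attention_map(teacher_act)
    return sum((a - b) ** 2 for a, b in zip(s, t)) / len(s)

def total_loss(cross_entropy, student_acts, teacher_acts, beta):
    """Assumed form: cross-entropy + beta / 2 * sum of per-group AT losses."""
    at_term = sum(at_loss(s, t) for s, t in zip(student_acts, teacher_acts))
    return cross_entropy + beta / 2 * at_term

# Toy example: one group, two channels, two spatial positions.
student = [[1.0, 0.0], [1.0, 0.0]]
teacher = [[0.0, 1.0], [0.0, 1.0]]
loss_beta_1000 = total_loss(0.0, [student], [teacher], beta=1000)
loss_beta_2000 = total_loss(0.0, [student], [teacher], beta=2000)
```

Under these assumptions the attention term grows linearly with beta, so its weight relative to the cross-entropy term depends on both beta and on how the attention loss is averaged (over positions, batch, and groups). This is why the exact value used for ImageNet is needed to reproduce the result.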

Here are my reproduction results. "Imagenet_AT" is the run with beta set to 1000, which is much worse than the result in the paper. "Imagenet_AT2000" is the run after I tried raising beta to 2000. As you know, this experiment is very computationally expensive, so I stopped it after seeing that the earlier result was very poor:
image
Result in your paper
image