ArchipLab-LinfengZhang/Object-Detection-Knowledge-Distillation-ICLR2021

Some differences between the paper and the provided code

jy-Hamlet opened this issue · 0 comments

Hello, I found that some hyperparameter settings described in the paper differ from those in the code.
In the paper, they are:

| model     | α    | β    | γ    | T   |
|-----------|------|------|------|-----|
| one-stage | 4e-4 | 2e-2 | 4e-4 | 0.5 |
| two-stage | 7e-5 | 4e-3 | 7e-5 | 0.1 |

But in the code, they are:

| model     | α    | β    | γ    | T   |
|-----------|------|------|------|-----|
| one-stage | 2e-2 | 4e-4 | 4e-4 | 0.1 |
| two-stage | 4e-3 | 7e-5 | 7e-5 | 0.5 |
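
To make the mismatch explicit, here is a small sketch that compares the two sets of values. The dictionary and function names are my own for illustration and are not taken from the repository. From the numbers alone, γ matches, α and β look swapped with each other, and T looks swapped between the one-stage and two-stage rows, but I cannot tell which side is the intended one.

```python
# Values copied from the two tables in this issue.
# All names below are illustrative, not identifiers from the repository.
paper = {
    "one-stage": {"alpha": 4e-4, "beta": 2e-2, "gamma": 4e-4, "T": 0.5},
    "two-stage": {"alpha": 7e-5, "beta": 4e-3, "gamma": 7e-5, "T": 0.1},
}
code = {
    "one-stage": {"alpha": 2e-2, "beta": 4e-4, "gamma": 4e-4, "T": 0.1},
    "two-stage": {"alpha": 4e-3, "beta": 7e-5, "gamma": 7e-5, "T": 0.5},
}


def mismatches(a, b):
    """Return, per model type, the sorted names of hyperparameters that differ."""
    return {
        model: sorted(name for name in a[model] if a[model][name] != b[model][name])
        for model in a
    }


diff = mismatches(paper, code)
for model, names in diff.items():
    print(model, "differs in:", names)

# Check the "swapped" reading: alpha <-> beta within each row,
# and T exchanged between the one-stage and two-stage rows.
alpha_beta_swapped = all(
    paper[m]["alpha"] == code[m]["beta"] and paper[m]["beta"] == code[m]["alpha"]
    for m in paper
)
t_swapped = (
    paper["one-stage"]["T"] == code["two-stage"]["T"]
    and paper["two-stage"]["T"] == code["one-stage"]["T"]
)
print("alpha/beta swapped:", alpha_beta_swapped)
print("T swapped across model types:", t_swapped)
```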

Which version is accurate?