/LS-KD-compatibility

[ICML 2022] This work investigates the compatibility between label smoothing (LS) and knowledge distillation (KD). We suggest using an LS-trained teacher with low-temperature transfer to obtain high-performance students.
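
As a rough illustration of the two ingredients named above, the sketch below writes out a label-smoothed cross-entropy (for training the teacher) and a temperature-scaled distillation loss (for the transfer) in plain Python. It is not taken from this repository: the function names, the default `alpha=0.1`, and the `temperature ** 2` scaling (a common convention in the KD literature) are assumptions made for the example.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def smooth_labels(target, num_classes, alpha=0.1):
    """Label smoothing: (1 - alpha) * one_hot(target) + alpha / num_classes."""
    return [
        (1.0 - alpha) * (1.0 if k == target else 0.0) + alpha / num_classes
        for k in range(num_classes)
    ]


def ls_cross_entropy(logits, target, alpha=0.1):
    """Cross-entropy against the smoothed target (hypothetical teacher loss)."""
    soft_target = smooth_labels(target, len(logits), alpha)
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(soft_target, log_probs))


def kd_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) on temperature-softened outputs.

    The temperature ** 2 factor is an assumed convention, not something
    stated in the repository description.
    """
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))
    return (temperature ** 2) * kl


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5, 0.2]
    student = [2.5, 1.2, 0.3, 0.1]
    print("teacher LS loss:", ls_cross_entropy(teacher, target=0))
    # A low temperature (here 1.0) keeps the teacher distribution sharp.
    print("KD loss at T=1:", kd_loss(student, teacher, temperature=1.0))
```

In this sketch, "low-temperature transfer" simply means calling `kd_loss` with a small `temperature`; the value to use in practice is whatever the paper and repository specify.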

Primary language: Python · License: MIT
