Training code for "H-AT: Hybrid Attention Transfer for Knowledge Distillation"
Primary Language: Python