Code for the paper "BiLD: Bi-directional Logits Difference Loss for Large Language Model Distillation".
Primary Language: Python