Loss Weight ablation experiment
JinYu1998 opened this issue · 3 comments
Have you tried different loss weights to see how they affect the distillation results?
Hey @JinYu1998 - we did a coarse sweep over the KL weights when setting up preliminary experiments on just the LibriSpeech corpus. We found the setting from DistilBART to be best, so we committed to it for the rest of the project. We didn't do any further tuning of the loss weights on our full training set. You can find an ablation over the loss terms (not weights) on page 26 of the paper.
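For readers unfamiliar with the setup, here is a minimal sketch of what a weighted distillation objective and a coarse sweep over the KL weight might look like. It is an illustration only, not the project's training code: the function names, the default weights, the temperature argument, and the sweep values are all assumptions made for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, target_index,
                      ce_weight=1.0, kl_weight=1.0, temperature=1.0):
    """Return (weighted total, CE term, KL term) for one token position.

    ce_weight and kl_weight are hypothetical placeholders, not the
    values used in the project.
    """
    # Cross-entropy of the student against the ground-truth token.
    student_probs = softmax(student_logits)
    ce = -math.log(student_probs[target_index])

    # KL(teacher || student) on temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * (math.log(t) - math.log(s))
             for t, s in zip(p_teacher, p_student) if t > 0)
    kl *= temperature ** 2  # conventional rescaling when temperature != 1

    return ce_weight * ce + kl_weight * kl, ce, kl


# A coarse sweep over the KL weight (sweep values are illustrative).
student = [1.2, 0.3, -0.5]
teacher = [2.0, 0.1, -1.0]
for kl_weight in (0.0, 0.5, 1.0, 2.0):
    total, ce, kl = distillation_loss(student, teacher, target_index=0,
                                      kl_weight=kl_weight)
    print(f"kl_weight={kl_weight}: total={total:.4f} (ce={ce:.4f}, kl={kl:.4f})")
```

In a real sweep, each weight setting would be used to train a student and the resulting models compared on a held-out metric such as WER. The loop above only shows how the weight changes the combined loss value.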
Thank you for your reply. I have previously worked on dynamic temperature distillation for classifiers, and I just recently finished that work. I'm very interested in distillation for Whisper, and I look forward to combining my work with Distil-Whisper.