Why use `log_softmax` instead of `softmax`?
nguyenvulong opened this issue · 1 comment
nguyenvulong commented
The same question has been asked here and here. These repositories (which I think you already know) are other attempts to implement knowledge distillation algorithms.
Could you please explain why it uses `log_softmax` instead of `softmax`?
torchdistill/torchdistill/losses/single.py, lines 99 to 106 at 993ee94
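For context, a loss of this kind typically follows the pattern sketched below. This is a hypothetical reconstruction for illustration only, not the actual code at 993ee94; the class name, temperature handling, and scaling are assumptions.

```python
import torch
from torch import nn


class KDLossSketch(nn.Module):
    """Minimal sketch of a KL-divergence-based distillation loss."""

    def __init__(self, temperature=1.0):
        super().__init__()
        self.temperature = temperature
        self.kldiv = nn.KLDivLoss(reduction='batchmean')

    def forward(self, student_logits, teacher_logits):
        # Student side is passed as log-probabilities (log_softmax),
        # teacher side as probabilities (softmax), which is what
        # nn.KLDivLoss expects for input and target respectively.
        log_p_student = torch.log_softmax(student_logits / self.temperature, dim=1)
        p_teacher = torch.softmax(teacher_logits / self.temperature, dim=1)
        return self.kldiv(log_p_student, p_teacher) * (self.temperature ** 2)
```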
yoshitomo-matsubara commented
See `KLDivLoss` in the PyTorch documentation:
> To avoid underflow issues when computing this quantity, this loss expects the argument `input` in the log-space. The argument `target` may also be provided in the log-space if `log_target=True`.
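In other words, `KLDivLoss` already expects log-probabilities on the input (student) side, and `log_softmax` computes them much more stably than `log(softmax(x))`. A minimal sketch of the difference (the logit values here are made up for illustration):

```python
import torch
import torch.nn.functional as F

# log(softmax(x)) underflows for strongly negative logits, while
# log_softmax computes the same quantity stably via log-sum-exp.
x = torch.tensor([[-1000.0, 0.0]])
print(torch.log(F.softmax(x, dim=1)))  # tensor([[-inf, 0.]])   <- underflow
print(F.log_softmax(x, dim=1))         # tensor([[-1000., 0.]]) <- stable

# KLDivLoss (here via its functional form) therefore takes the
# input already in log-space, with the target as probabilities.
p_target = torch.tensor([[0.3, 0.7]])
loss = F.kl_div(F.log_softmax(x, dim=1), p_target, reduction="batchmean")
print(loss)
```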
Also, please use Discussions above (instead of Issues) for questions.
As explained here, I want to keep Issues mainly for bug reports.