About the "reduction" argument of KLDivLoss
junfish opened this issue · 0 comments
junfish commented
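The reason your temperature is larger than the original paper's setting (say, T = 2) may be how KLDivLoss reduces the loss. If the loss uses the default reduction="mean", it is averaged over every element (batch size × number of classes) instead of over the batch only, so the value is smaller than the true KL divergence by a factor equal to the number of classes. You could try setting reduction="batchmean" in KLDivLoss. This is just a guess; others are welcome to discuss.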
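
To make the scaling concrete, here is a minimal pure-Python sketch of the two reductions as I understand them from the PyTorch documentation. It does not call PyTorch; `softmax`, `kl_div`, and the logits are hypothetical names and values made up for illustration, not code from this repo.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax, shifted by the max for numerical stability.
    m = max(logits)
    exps = [math.exp((x - m) / T) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_div(student_batch, teacher_batch, T, reduction):
    # Pointwise terms q * (log q - log p), with q = teacher, p = student.
    total = 0.0
    n_elem = 0
    for s_logits, t_logits in zip(student_batch, teacher_batch):
        p = softmax(s_logits, T)
        q = softmax(t_logits, T)
        for p_i, q_i in zip(p, q):
            total += q_i * (math.log(q_i) - math.log(p_i))
            n_elem += 1
    if reduction == "batchmean":
        return total / len(student_batch)  # divide by batch size only
    if reduction == "mean":
        return total / n_elem              # divide by batch size * classes
    raise ValueError(f"unsupported reduction: {reduction}")

# Made-up logits: batch of 2 samples, 4 classes each.
student = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.1, -1.0, 2.0]]
teacher = [[1.5, 1.0, 3.5, 5.0], [0.0, 1.0, -2.0, 3.0]]

mean_loss = kl_div(student, teacher, T=2.0, reduction="mean")
batchmean_loss = kl_div(student, teacher, T=2.0, reduction="batchmean")
num_classes = len(student[0])

print("mean:", mean_loss)
print("batchmean:", batchmean_loss)
print("ratio:", batchmean_loss / mean_loss, "num_classes:", num_classes)
```

In this sketch the "batchmean" value is the "mean" value multiplied by the number of classes. With many classes, the KL term under "mean" would be down-weighted a lot, which could be part of why a different temperature ends up working better.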