Why isn't softmax used when computing the multiple-choice loss?
eveliao opened this issue · 1 comment
eveliao commented
The inputs are just logits, not normalized by softmax. Why can we directly compute the cross entropy loss with them and y?
pytorch-openai-transformer-lm/loss.py
Line 22 in bfd8e09
eveliao commented
Cross entropy already includes the log_softmax function, sorry...
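
For reference, a minimal standalone sketch (not taken from the repo's loss.py) showing why raw logits are fine: PyTorch's `nn.CrossEntropyLoss` applies `log_softmax` internally and then negative log-likelihood, so passing unnormalized scores gives the same result as normalizing first.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy multiple-choice setup: 3 examples, 4 answer choices each (shapes are illustrative).
logits = torch.randn(3, 4)                 # raw, unnormalized scores
targets = torch.tensor([0, 2, 1])          # index of the correct choice per example

# CrossEntropyLoss takes the raw logits directly ...
ce = nn.CrossEntropyLoss()(logits, targets)

# ... because it is equivalent to log_softmax followed by NLLLoss.
nll = nn.NLLLoss()(F.log_softmax(logits, dim=-1), targets)

print(torch.allclose(ce, nll))  # True
```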