huggingface/pytorch-openai-transformer-lm

Why isn't softmax applied when computing the multiple-choice loss?

eveliao opened this issue · 1 comments

The inputs are raw logits, not normalized by softmax. Why can we compute the cross-entropy loss directly from them and Y?

clf_losses = self.clf_criterion(clf_logits, Y)

Sorry, never mind: the cross-entropy loss already includes the log_softmax function.
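
For anyone landing here with the same question, here is a minimal sketch of the equivalence. It assumes clf_criterion is an nn.CrossEntropyLoss instance; the tensor shapes, the values, and the reduction="none" setting are made up for illustration and are not taken from the repo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical data: 4 examples with 2 answer choices each.
clf_logits = torch.randn(4, 2)   # raw, unnormalized scores
Y = torch.tensor([0, 1, 1, 0])   # index of the correct choice per example

# Assumption: clf_criterion is an nn.CrossEntropyLoss (per-example losses here).
clf_criterion = nn.CrossEntropyLoss(reduction="none")
clf_losses = clf_criterion(clf_logits, Y)

# The same computation spelled out: log_softmax, then negative log-likelihood.
log_probs = F.log_softmax(clf_logits, dim=-1)
manual_losses = F.nll_loss(log_probs, Y, reduction="none")

# True when both paths give the same per-example losses.
same = torch.allclose(clf_losses, manual_losses)
print(same)
```

So applying softmax yourself before the criterion would normalize twice and give a different (wrong) loss.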