Why isn't softmax used when computing the multiple-choice loss?
eveliao opened this issue · 1 comment
eveliao commented
The inputs are just logits, not normalized by softmax. Why can we directly compute the cross entropy loss with them and y?
pytorch-openai-transformer-lm/loss.py
Line 22 in bfd8e09
eveliao commented
Cross entropy already includes the log_softmax function, sorry...
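
For reference, a minimal standalone sketch (not taken from the repo's loss.py) showing why raw logits are fine: PyTorch's `nn.CrossEntropyLoss` applies `log_softmax` internally and then negative log-likelihood, so passing unnormalized scores gives the same result as normalizing first.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy multiple-choice setup: 3 examples, 4 answer choices each (shapes are illustrative).
logits = torch.randn(3, 4)                 # raw, unnormalized scores
targets = torch.tensor([0, 2, 1])          # index of the correct choice per example

# CrossEntropyLoss takes the raw logits directly ...
ce = nn.CrossEntropyLoss()(logits, targets)

# ... because it is equivalent to log_softmax followed by NLLLoss.
nll = nn.NLLLoss()(F.log_softmax(logits, dim=-1), targets)

print(torch.allclose(ce, nll))  # True
```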