Does your implementation require activation in the final layers?

Question

sarmientoj24 opened this issue 2 years ago · 2 comments

Answer 1 · 2022-10-16T17:33:53.000Z

Does it work with raw logits, or not?

Answer 2 · 2022-10-16T18:54:26.000Z

Hi sarmientoj24,

Yes, an activation is required: the loss functions are written on the assumption that the model has a softmax activation as its final layer, so they expect probabilities rather than raw logits.
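As a minimal sketch of what that means in practice: if a model emits raw logits, they could be converted to probabilities with a softmax before being passed to the loss. The example below uses plain NumPy because the framework is not stated in this thread; the `softmax` helper and the `logits` values are illustrative assumptions, not part of the library.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis (illustrative helper)."""
    # Subtracting the per-row maximum avoids overflow in exp() without
    # changing the result.
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


# Hypothetical raw model outputs: 2 samples, 3 classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 0.5, 3.0]])

# Probabilities: every row is non-negative and sums to 1. This is the form
# of input that a loss expecting a softmax final layer would take.
probs = softmax(logits)
```

In a deep learning framework the same effect is usually achieved by adding a softmax activation as the model's last layer, so no separate conversion step is needed.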

Best,
Michael