Does your implementation require activation in the final layers?
sarmientoj24 opened this issue · 2 comments
sarmientoj24 commented
Does your implementation require activation in the final layers?
sarmientoj24 commented
Does it work with raw logits, or not?
mlyg commented
Hi sarmientoj24,
Yes, the loss functions are written on the assumption that the model has a softmax activation as its final layer, so they expect probabilities rather than raw logits.
Best,
Michael
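For anyone whose model outputs raw logits, a minimal sketch of the conversion step is below. It is illustrative only and is not code from this repository: `softmax` and `cross_entropy_on_probs` are hypothetical helpers written in plain Python, and the cross-entropy is only a stand-in for any loss that expects probabilities.

```python
import math

def softmax(logits):
    """Convert a list of raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_on_probs(y_true, y_prob, eps=1e-7):
    """Hypothetical stand-in for a loss that expects probabilities, not logits."""
    return -sum(
        t * math.log(min(max(p, eps), 1.0 - eps))  # clip to avoid log(0)
        for t, p in zip(y_true, y_prob)
    )

# Raw model outputs (logits) for one sample with three classes.
logits = [2.0, -1.0, 0.5]
y_true = [1.0, 0.0, 0.0]  # one-hot label

# Apply softmax first, then pass the probabilities to the loss.
probs = softmax(logits)
loss = cross_entropy_on_probs(y_true, probs)
```

In a real model, the same effect would usually come from adding a softmax activation to the final layer, or from applying the framework's softmax to the logits before calling the loss.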