bkj opened this issue 6 years ago · 0 comments
On this line you're applying a softmax to the similarities.
Then later you apply cross_entropy, which is a log softmax + NLL loss.
I think you probably want to remove the first softmax, since otherwise the similarities are effectively passed through softmax twice before the NLL loss.
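To illustrate the effect, here is a minimal pure-Python sketch (no PyTorch, so it is not the repo's code). The `softmax` and `cross_entropy` helpers and the `sims` values are hypothetical stand-ins; `cross_entropy` just mirrors the log-softmax + NLL combination described above.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    # log-softmax followed by NLL of the target class:
    # -log(softmax(logits)[target]) = logsumexp(logits) - logits[target]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

# Hypothetical similarity scores; class 0 is the correct one.
sims = [5.0, 1.0, 0.0]
target = 0

# Intended usage: raw similarities go straight into cross_entropy.
loss_single = cross_entropy(sims, target)

# What happens with the extra softmax: probabilities are treated as logits.
loss_double = cross_entropy(softmax(sims), target)

# Best case for the double-softmax path: a perfect one-hot "prediction".
loss_floor = cross_entropy([1.0, 0.0, 0.0], target)

print(f"single softmax: {loss_single:.4f}")
print(f"double softmax: {loss_double:.4f}")
print(f"double softmax floor (3 classes): {loss_floor:.4f}")
```

Because the inputs to the second softmax are squashed into [0, 1], the loss in the double-softmax path cannot fall below `log(e + n - 1) - 1` for `n` classes, even for a confident, correct prediction. That is probably not what you want here.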
~ Ben