The output for CRF loss
mxc19912008 opened this issue · 5 comments
Hi Allan,
Can I ask what is returned by the CRF forward function (the loss) here?
Is it the NLL loss? It seems quite large, not like an NLL loss.
Thanks again!
Yes.
For the CRF, it is the negative log-likelihood.
Mathematically, the probability $p(y|x)$ is:

$$p(y|x) = \frac{\exp(\mathrm{score}(x, y))}{\sum_{y'} \exp(\mathrm{score}(x, y'))}$$

If we take the negative log-likelihood, it becomes:

$$-\log p(y|x) = \log \sum_{y'} \exp(\mathrm{score}(x, y')) - \mathrm{score}(x, y)$$

So the left-hand term (the log-partition over all tag sequences) is `unlabeled_score`, the right-hand term is `labeled_score`, and we return their subtraction.
You can also take the mean over the batch to make the loss smaller, but I guess it doesn't change the performance much.
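If it helps, here is a minimal sketch (not the actual code in this repo) of how that subtraction is typically computed for a linear-chain CRF. The function and variable names are my own, and I use batch size 1 to keep it short:

```python
import torch

def crf_nll(emissions, transitions, tags):
    # emissions: (seq_len, num_tags) per-position tag scores
    # transitions: (num_tags, num_tags); transitions[i, j] = score of tag i -> tag j
    # tags: (seq_len,) gold tag indices
    seq_len, num_tags = emissions.shape

    # labeled_score: score(x, y) of the gold tag sequence
    labeled_score = emissions[0, tags[0]]
    for t in range(1, seq_len):
        labeled_score = labeled_score + transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]

    # unlabeled_score: log-partition log Z(x), via the forward algorithm in log space
    alpha = emissions[0]  # (num_tags,)
    for t in range(1, seq_len):
        # alpha[j] = logsumexp_i(alpha[i] + transitions[i, j]) + emissions[t, j]
        alpha = torch.logsumexp(alpha.unsqueeze(1) + transitions, dim=0) + emissions[t]
    unlabeled_score = torch.logsumexp(alpha, dim=0)

    # NLL = log Z(x) - score(x, y) = unlabeled_score - labeled_score
    return unlabeled_score - labeled_score

# Example usage with random scores
emissions = torch.randn(5, 3)
transitions = torch.randn(3, 3)
tags = torch.tensor([0, 2, 1, 1, 0])
print(crf_nll(emissions, transitions, tags))
```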
I see, thanks again! :)
Hi Allan,
I have one more question if you don't mind.
If I change the loss like this:

```python
p = torch.exp(labeled_score - unlabeled_score)
return -torch.log(p)
```

it should work: `unlabeled_score - labeled_score` is the NLL loss, so the probability can be expressed as above, and applying `-torch.log` should give back the same NLL loss. Instead, it generates "nan" at every epoch.
Thanks, Allan!
Because initially, the labeled score (in log space) could be something like -180, while the unlabeled score could be 557 (I debugged inside and checked the values).
In that case, `labeled_score - unlabeled_score` is around -737, and taking the `exp` of that underflows to zero, exceeding the floating-point range in Python/PyTorch.
It's like `torch.exp(-1000) = 0`, and `torch.exp(-500)` still `= 0`.
Thus, `log(0)` gives you `-inf`, and the loss and gradients become `nan`. That's why we prefer to work in log space directly.
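A quick check with the values above (plain float32 tensors, PyTorch's default) shows the underflow:

```python
import torch

labeled_score = torch.tensor(-180.0)
unlabeled_score = torch.tensor(557.0)

p = torch.exp(labeled_score - unlabeled_score)  # exp(-737) underflows to 0.0
print(p)              # tensor(0.)
print(-torch.log(p))  # tensor(inf); gradients through this become nan

# Staying in log space avoids the exp/log round trip entirely:
print(unlabeled_score - labeled_score)  # tensor(737.)
```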
Thanks, Allan! I wanted to add something to the softmax before the log. I'll do more experiments :)