The output for CRF loss
mxc19912008 opened this issue · 5 comments
Hi Allan,
Can I ask what is returned by the CRF forward function (the loss) here?
Is it the NLL loss? It seems quite large, not like an NLL loss.
Thanks again!
Yes.
For the CRF, it is the negative log-likelihood.
Mathematically, the probability $p(y|x)$ is:

$$p(y|x) = \frac{\exp(\mathrm{score}(x, y))}{\sum_{y'} \exp(\mathrm{score}(x, y'))}$$

If we take the negative log-likelihood, it becomes:

$$-\log p(y|x) = \log \sum_{y'} \exp(\mathrm{score}(x, y')) - \mathrm{score}(x, y)$$

So the left-hand term (the log-partition over all tag sequences) is `unlabeled_score`, the right-hand term is `labeled_score`, and we return their subtraction.
You can also take the mean over the batch to make the loss smaller, but I guess it doesn't change the performance much.
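If it helps, here is a minimal sketch (not the actual code in this repo) of how that subtraction is typically computed for a linear-chain CRF. The function and variable names are my own, and I use batch size 1 to keep it short:

```python
import torch

def crf_nll(emissions, transitions, tags):
    # emissions: (seq_len, num_tags) per-position tag scores
    # transitions: (num_tags, num_tags); transitions[i, j] = score of tag i -> tag j
    # tags: (seq_len,) gold tag indices
    seq_len, num_tags = emissions.shape

    # labeled_score: score(x, y) of the gold tag sequence
    labeled_score = emissions[0, tags[0]]
    for t in range(1, seq_len):
        labeled_score = labeled_score + transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]

    # unlabeled_score: log-partition log Z(x), via the forward algorithm in log space
    alpha = emissions[0]  # (num_tags,)
    for t in range(1, seq_len):
        # alpha[j] = logsumexp_i(alpha[i] + transitions[i, j]) + emissions[t, j]
        alpha = torch.logsumexp(alpha.unsqueeze(1) + transitions, dim=0) + emissions[t]
    unlabeled_score = torch.logsumexp(alpha, dim=0)

    # NLL = log Z(x) - score(x, y) = unlabeled_score - labeled_score
    return unlabeled_score - labeled_score

# Example usage with random scores
emissions = torch.randn(5, 3)
transitions = torch.randn(3, 3)
tags = torch.tensor([0, 2, 1, 1, 0])
print(crf_nll(emissions, transitions, tags))
```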
I see, thanks again! :)
Hi Allan,
I have one more question if you don't mind.
If I change the loss like this:

```python
p = torch.exp(labeled_score - unlabeled_score)
return -torch.log(p)
```

it should work: `unlabeled_score - labeled_score` is the NLL loss, so the probability can be expressed as above, and applying `-torch.log` should give back the same NLL loss. Instead, it generates "nan" at every epoch.
Thanks, Allan!
Because initially, the labeled score (in log space) could be something like -180, while the unlabeled score could be 557 (I debugged inside and checked the values).
In that case, `labeled_score - unlabeled_score` is around -737, and taking the `exp` of that underflows to zero, exceeding the floating-point range in Python/PyTorch.
It's like `torch.exp(-1000) = 0`, and `torch.exp(-500)` still `= 0`.
Thus, `log(0)` gives you `-inf`, and the loss and gradients become `nan`. That's why we prefer to work in log space directly.
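A quick check with the values above (plain float32 tensors, PyTorch's default) shows the underflow:

```python
import torch

labeled_score = torch.tensor(-180.0)
unlabeled_score = torch.tensor(557.0)

p = torch.exp(labeled_score - unlabeled_score)  # exp(-737) underflows to 0.0
print(p)              # tensor(0.)
print(-torch.log(p))  # tensor(inf); gradients through this become nan

# Staying in log space avoids the exp/log round trip entirely:
print(unlabeled_score - labeled_score)  # tensor(737.)
```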
Thanks, Allan! I wanted to add something to the softmax before the log. I'll do more experiments :)