dmis-lab/BioSyn

What if none of the top-k candidates is correct? Could the loss be infinite?

Closed this issue · 2 comments

Hello. After reading the related paper, I have a question about the loss calculation. Formula 7 in the paper defines the marginal probability of the positive synonyms of a mention m. What if none of the top-k synonyms satisfies
EQUAL(m, n) = 1
Then the marginal probability would be zero, and in formula 8 the term log 0 diverges to negative infinity, so the loss would be infinite, which seems problematic.
Looking forward to your reply.

Hi @flyangovoyang

That's a really great point!
For those cases, we filter out samples with zero marginal probability, so they do not contribute to the loss.
https://github.com/dmis-lab/BioSyn/blob/master/src/biosyn/rerankNet.py#L107
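
For readers who want to see the idea without opening the repository, here is a minimal sketch of the filtering described above. It is not the code at the linked line: it uses plain Python lists instead of tensors, the function name `marginal_nll_sketch` and its arguments are hypothetical, and returning 0.0 when every sample is filtered out is an assumption made for illustration.

```python
import math


def marginal_nll_sketch(scores, labels):
    """Illustrative marginal negative log-likelihood over top-k candidates.

    scores: list of rows, one row of k similarity scores per mention.
    labels: same shape; 1 where EQUAL(m, n) = 1, else 0.
    (Hypothetical names and signature, not the BioSyn API.)
    """
    losses = []
    for row_scores, row_labels in zip(scores, labels):
        # Softmax over the k candidates (shifted by the max for stability).
        shift = max(row_scores)
        exps = [math.exp(s - shift) for s in row_scores]
        total = sum(exps)

        # Marginal probability: sum of probabilities of positive candidates.
        marginal = sum(e / total for e, y in zip(exps, row_labels) if y == 1)

        # No positive in the top-k means marginal == 0, and log(0) would
        # diverge, so the sample is skipped instead of contributing a loss.
        if marginal <= 0.0:
            continue
        losses.append(-math.log(marginal))

    # Assumption: if every sample was filtered out, report zero loss.
    if not losses:
        return 0.0
    return sum(losses) / len(losses)


# First mention has one positive among two equally scored candidates;
# the second mention has no positive in its top-k and is skipped.
example = marginal_nll_sketch([[0.0, 0.0], [1.0, 2.0]], [[1, 0], [0, 0]])
```

In this sketch the average is taken only over the samples that survive the filter, so a mention with no positive candidate neither produces an infinite term nor dilutes the mean.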

Oops, the question occurred to me because I didn't see any further explanation in the paper. Since the code already covers this special case, everything is OK. Thank you for your time!