Co-Reference

Question

gamallo opened this issue 8 years ago · 8 comments

A new module for co-reference resolution will be integrated by Marcos Garcia.

Answer 1 · 2016-09-22T11:48:27.000Z

Any update on this? 💪

Answer 2 · 2016-09-22T14:45:18.000Z

The prototype for co-reference identification was implemented several months ago by Marcos Garcia. He committed to integrating the module into Linguakit, but he doesn't have a GitHub account yet.

Answer 3 · 2016-10-19T21:20:24.000Z

Module uploaded!

Answer 4 · 2016-10-24T08:57:29.000Z

I don't understand how the module works, probably because I don't understand exactly what it does. I find it confusing that there is a "coref" parameter to use the module, but there also seems to be a "-coref" parameter.

I'd do this myself, but I don't know if I'm missing something:

Correct the README.md: where it says "COREF (parameter -coref)", it should say "COREF (parameter coref)".
The "-coref" parameter (file linguakit, line 156) should not be accepted as a valid parameter. Neither should "coref" on that same line, because module identifiers are already handled earlier.

Could you also add a usage example in the Examples part of the README.md?

Answer 5 · 2016-10-24T09:45:35.000Z

On a side note, maybe the module lives in the tagger subdirectory because of some affinity with it, but I find that confusing too.

Answer 6 · 2016-10-24T09:48:27.000Z

Thanks! I've just corrected the README and linguakit files. Also, I modified the en.txt test file in order to show how COREF works.

If you run coref on the test (./linguakit en coref test/en.txt) you will see that NPs contain an extra column with a numerical ID. Ideally, this ID should be the same in the NPs referring to the same discourse entity (Paul = Paul_Wilson (but not Mary_Wilson); Sandra = Sandra_Curtis, etc.).
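
To make the idea of the ID column concrete, here is a minimal Python sketch that groups mentions by that ID. It is not part of Linguakit. It assumes the output is whitespace-separated, that the first column is the token, and that only lines carrying a co-reference ID end in an integer column; the real column layout may differ. The function name `group_mentions` and the sample lines are hypothetical.

```python
from collections import defaultdict


def group_mentions(lines):
    """Group tokens by the trailing numeric ID column (assumed layout)."""
    clusters = defaultdict(list)
    for line in lines:
        cols = line.split()
        # Assumption: only lines with a co-reference ID end in an integer.
        if len(cols) >= 2 and cols[-1].isdigit():
            clusters[int(cols[-1])].append(cols[0])
    return dict(clusters)


# Hypothetical sample for illustration, NOT real Linguakit output.
sample = [
    "Paul_Wilson paul_wilson NP 1",
    "met meet VERB",
    "Sandra_Curtis sandra_curtis NP 2",
    "Paul paul NP 1",
    "Sandra sandra NP 2",
]

clusters = group_mentions(sample)
for cluster_id, mentions in sorted(clusters.items()):
    print(cluster_id, mentions)
```

In practice you would feed it the lines printed by `./linguakit en coref test/en.txt`, after adjusting the column positions to match the actual output.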

The -crnec option (experimental) uses the information provided by this kind of clustering to (try to) correct wrong NEC labels.

Answer 7 · 2016-10-24T09:52:14.000Z

Yes, it was just a NEC option in the first commit. It was later promoted to a real parameter.

Actually, it could be seen just as a NEC extension, or as a completely new NLP module.

Answer 8 · 2016-10-24T10:01:37.000Z

Thanks for the quick response, the explanation and examples. 😀
I'm closing this again, hope that's ok.