ThiagoCF05/webnlg

Coverage tests

AmitMY opened this issue · 2 comments

Thanks for doing all this.

I have a question about coverage: did you test the coverage of your manual work?

Looking at train, 7triples (the first file I opened), I see in the first sentence:

AGENT-1 was born in PATIENT-4 and is from the U.S. . AGENT-1 graduated in 1955 from PATIENT-3 . AGENT-1 worked as PATIENT-2 and for NASA in PATIENT-6 . AGENT-1 spent PATIENT-5 in space and is now retired .

"U.S." should be replaced with PATIENT-1, the entire phrase "in 1955 from UT Austin with a B.S" should be replaced with PATIENT-3, and "retired" should be replaced with PATIENT-7.

Would you say that these kinds of problems are to be expected? Did you run any coverage test to make sure you didn't miss anything? (For two of the cases above, an automated test could catch them.)
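To make the idea of an automated test concrete, here is a minimal sketch of the kind of check I have in mind. It is not taken from the repository: the function names `missing_tags` and `leaked_entities`, the tag regex, and the shape of the tag-to-entity mapping are all my own assumptions. It flags tags that are expected for an instance but never appear in the template, and entity strings that still appear verbatim in the template.

```python
import re

# Assumed tag format, based only on the tags visible in the template above.
TAG_PATTERN = re.compile(r"\b(?:AGENT|PATIENT)-\d+\b")


def missing_tags(template, expected_tags):
    """Return expected tags that never occur in the delexicalised template."""
    found = set(TAG_PATTERN.findall(template))
    return sorted(set(expected_tags) - found)


def leaked_entities(template, tag_to_entity):
    """Return tags whose entity text still appears verbatim in the template."""
    return sorted(
        tag for tag, entity in tag_to_entity.items() if entity in template
    )


# The template quoted in this issue.
template = (
    "AGENT-1 was born in PATIENT-4 and is from the U.S. . "
    "AGENT-1 graduated in 1955 from PATIENT-3 . "
    "AGENT-1 worked as PATIENT-2 and for NASA in PATIENT-6 . "
    "AGENT-1 spent PATIENT-5 in space and is now retired ."
)

# Hypothetical: one agent plus seven patients for a 7-triple instance.
expected = ["AGENT-1"] + [f"PATIENT-{i}" for i in range(1, 8)]

# Only the two mappings stated in this issue; the rest are left out on purpose.
known_entities = {"PATIENT-1": "U.S.", "PATIENT-7": "retired"}

print(missing_tags(template, expected))
print(leaked_entities(template, known_entities))
```

Both checks report PATIENT-1 and PATIENT-7 for this template. Neither catches the PATIENT-3 case, because that tag is present and only its span is too narrow; that one would still need manual review.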

Dear,

Thank you very much for this. We did run a coverage test, but you are right. The quality of the Monument and Astronaut domains in train/7triples is very poor, and you should not expect this in the rest of the corpus. We are sorry and will fix this as soon as possible (within 2 weeks).

Dear Amit,

The problems with the 7triples instances in the training set have been reviewed and fixed. Over the next day we are going to run our automatic script to extract discourse information and referring expressions.