Question on en-hi test set
Hi, congratulations on your paper!
I am working on word alignment between en and hi. I found that two en-hi test sets are provided at this link: en-hi.wa and en-hi.wa.nonnullalign. Which one is used in the paper?
My test results on en-hi (using subword embeddings):
en-hi.wa.nonnullalign:
XLM-R Argmax prec=85.62 rec=46.91 f1=60.61 AER=39.39
XLM-R IterMax prec=75.36 rec=51.88 f1=61.45 AER=38.55
en-hi.wa:
XLM-R Argmax prec=85.62 rec=36.32 f1=51.00 AER=49.00
The result reported in the paper is:
XLM-R Argmax f1=60 AER=40
So, do you use en-hi.wa.nonnullalign as the test set?
Hi,
Thank you.
Yes, we use "en-hi.wa.nonnullalign", since we don't generate null alignments in the output (we just skip them).
If you look at Table 5 in the supplementary material, the XLM-R IterMax result is also reported there.
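For anyone comparing the two files, here is a minimal sketch of why precision stays the same while recall drops once null alignments are counted in the gold set. It uses the standard precision/recall/F1/AER definitions, not necessarily the exact evaluation script behind the paper. The function name `alignment_scores`, the toy links, and the use of index 0 to mark a null alignment are all assumptions made for illustration.

```python
def alignment_scores(predicted, sure, possible=None):
    """Return (precision, recall, f1, aer) as fractions.

    predicted, sure, possible are sets of (source_index, target_index) links.
    If no possible links are given, they default to the sure links.
    """
    possible = sure if possible is None else (possible | sure)
    hits_sure = len(predicted & sure)
    hits_possible = len(predicted & possible)

    precision = hits_possible / len(predicted) if predicted else 0.0
    recall = hits_sure / len(sure) if sure else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    total = len(predicted) + len(sure)
    aer = 1 - (hits_sure + hits_possible) / total if total else 0.0
    return precision, recall, f1, aer


# Toy data (hypothetical): the aligner never emits null links.
predicted = {(1, 1), (2, 2), (3, 4)}
gold_nonnull = {(1, 1), (2, 2), (3, 3), (4, 4)}
# Assumption: a null alignment is encoded as a link to index 0.
gold_with_null = gold_nonnull | {(5, 0), (0, 5)}

p1, r1, f1_1, aer1 = alignment_scores(predicted, gold_nonnull)
p2, r2, f1_2, aer2 = alignment_scores(predicted, gold_with_null)

print(f"nonnull:   prec={p1:.4f} rec={r1:.4f} f1={f1_1:.4f} AER={aer1:.4f}")
print(f"with null: prec={p2:.4f} rec={r2:.4f} f1={f1_2:.4f} AER={aer2:.4f}")
```

In this sketch the extra null links only enlarge the gold set, so precision is unchanged and recall falls, which matches the pattern in the numbers above. When the gold set has only sure links, AER equals 1 - F1, which is also consistent with the reported f1 and AER values summing to 100.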
Thank you so much!
BTW, I cannot find any word alignment test sets other than en-hi and en-de. Could you share the test sets or send me a copy?
Links will be in the camera-ready version. I also added them to the README.