AmericasNLP/americasnlp2021

Publish the baseline results for each language pair?

ftyers opened this issue · 4 comments

It would be useful for sanity-checking the results we get, e.g. to confirm that we have the baseline set up correctly.

It might also be good to publish the random seed for this purpose.

I'll update this issue as we get them:

| Pair | Epochs | Converged? | chrF2 | BLEU |
|---|---|---|---|---|
| Spanish→Aymara | 20 | No | 0.176 | 0.96 |
| Spanish→Aymara | 30 | Yes | 0.211 | 1.94 |
| Spanish→Bribri | 30 | ? | 0.239 | 8.85 |
| Spanish→Nahuatl | 30 | ? | 0.276 | 5.21 |
| Spanish→Hñähñu | 30 | ? | 0.228 | 4.17 |
| Spanish→Quechua | 30 | Yes | 0.343 | 12.60 |
| Spanish→Shipibo-Konibo | 30 | Yes | 0.174 | 0.38 |
| Spanish→Raramuri | 30 | ? | 0.242 | 5.32 |
| Spanish→Wixarika | 30 | Yes | 0.296 | 14.33 |

These results use the first 200 sentences of each training set as dev, the next 200 as test, and the remainder as train.
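For anyone wanting to reproduce that split, here is a minimal sketch of one way to do it. The function name `split_parallel` and its parameters are hypothetical, not part of the shared task's baseline code; it only assumes that the source and target sides are line-aligned lists of sentences.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]


def split_parallel(
    src: Sequence[str],
    tgt: Sequence[str],
    dev_size: int = 200,
    test_size: int = 200,
) -> Tuple[List[Pair], List[Pair], List[Pair]]:
    """Split a line-aligned parallel corpus into (dev, test, train).

    The first `dev_size` pairs become dev, the next `test_size` pairs
    become test, and everything after that is train. No shuffling is
    done, so the result does not depend on a random seed.
    """
    if len(src) != len(tgt):
        raise ValueError("source and target must have the same number of lines")
    pairs = list(zip(src, tgt))
    if len(pairs) <= dev_size + test_size:
        raise ValueError("corpus too small to leave any training data")
    dev = pairs[:dev_size]
    test = pairs[dev_size:dev_size + test_size]
    train = pairs[dev_size + test_size:]
    return dev, test, train
```

Splitting the zipped pairs rather than each side separately keeps source and target sentences aligned in every subset.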

Thank you very much. This is a good idea. Next week we will publish the baseline values. I have a question regarding Aymara: why do you have two experiments for it?

No problem! :) As for the two values, we first ran 20 epochs, but noticed that the loss on the dev set hadn't converged, so we tried 30 epochs and it seemed to converge.
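The thread doesn't say how convergence was judged, so the following is only a sketch of one common heuristic: treat training as converged when the best dev loss over the last few epochs is no meaningful improvement on the best seen before them. The name `has_converged` and the `patience` and `min_delta` defaults are assumptions for illustration.

```python
from typing import Sequence


def has_converged(
    dev_losses: Sequence[float],
    patience: int = 3,
    min_delta: float = 1e-3,
) -> bool:
    """Return True if dev loss has plateaued.

    `dev_losses` holds one dev-set loss per epoch, in order. The run is
    considered converged when the best loss in the last `patience`
    epochs improves on the best earlier loss by less than `min_delta`.
    """
    if len(dev_losses) <= patience:
        # Not enough history to compare recent epochs against earlier ones.
        return False
    best_before = min(dev_losses[:-patience])
    best_recent = min(dev_losses[-patience:])
    return best_before - best_recent < min_delta
```

With a check like this, a run whose dev loss is still dropping at epoch 20 would be reported as not converged, which is the situation described for the first Aymara experiment.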

Sorry, I just saw that this issue was still open. The baseline for all languages is online. :) Thanks a lot!