Publish the baseline results for each language pair?
ftyers opened this issue · 4 comments
It would be useful for sanity-checking the results we get, e.g. whether we have the baseline set up correctly.
It might also be good to publish the random seed for this purpose.
I'll update this issue as we get them:
Pair | Epochs | Converged? | chrF2 | BLEU |
---|---|---|---|---|
Spanish→Aymara | 20 | No | 0.176 | 0.96 |
Spanish→Aymara | 30 | Yes | 0.211 | 1.94 |
Spanish→Bribri | 30 | ? | 0.239 | 8.85 |
Spanish→Nahuatl | 30 | ? | 0.276 | 5.21 |
Spanish→Hñähñu | 30 | ? | 0.228 | 4.17 |
Spanish→Quechua | 30 | Yes | 0.343 | 12.60 |
Spanish→Shipibo-Konibo | 30 | Yes | 0.174 | 0.38 |
Spanish→Raramuri | 30 | ? | 0.242 | 5.32 |
Spanish→Wixarika | 30 | Yes | 0.296 | 14.33 |
These numbers use the first 200 sentences of each training set as dev, the next 200 as test, and the remainder as train.
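For anyone trying to reproduce the split, here is a minimal sketch of it. The function name `split_corpus` and the list-of-pairs input are assumptions made for illustration, not the actual baseline script; only the 200/200/remainder sizes come from the description above.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def split_corpus(
    pairs: Sequence[Pair], dev_size: int = 200, test_size: int = 200
) -> Tuple[List[Pair], List[Pair], List[Pair]]:
    """Split a parallel corpus in file order: dev first, then test, then train.

    Hypothetical helper: the 200/200 defaults mirror the split described
    in this issue, everything else is an illustrative assumption.
    """
    if len(pairs) <= dev_size + test_size:
        raise ValueError("corpus too small to leave any training data")
    dev = list(pairs[:dev_size])
    test = list(pairs[dev_size:dev_size + test_size])
    train = list(pairs[dev_size + test_size:])
    return train, dev, test
```

Because the split is positional rather than random, it does not depend on a seed, so the same input files always give the same dev and test sets.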
Thanks a lot. This is a good idea. Next week we will publish the baseline values. I have a question regarding Aymara: why do you have two experiments with Aymara?
No problem! :) As for the two values, we did one with 20 epochs, but noticed that the loss on the dev set didn't converge, so we tried with 30 and it seemed to converge.
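To make "converged" a bit less of an eyeball judgement, something like the following plateau check could be used. This is only a sketch: the name `has_converged`, the `window` of 3 epochs, and the `tol` threshold are assumptions for illustration, not the criterion actually used for the table above.

```python
from typing import Sequence


def has_converged(
    dev_losses: Sequence[float], window: int = 3, tol: float = 1e-3
) -> bool:
    """Return True if the dev loss stopped improving over the last `window` epochs.

    Hypothetical criterion: compares the best loss seen in the most recent
    `window` epochs against the best loss seen before them.
    """
    if len(dev_losses) <= window:
        return False  # not enough history to judge
    best_before = min(dev_losses[:-window])
    best_recent = min(dev_losses[-window:])
    return best_before - best_recent < tol
```

With per-epoch dev losses logged, calling this after 20 and after 30 epochs would give a yes/no value for the "Converged?" column.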
Sorry, I just saw that this issue was still open. The baseline for all languages is online. :) Thanks a lot!