Download data for specific language pair
bricksdont opened this issue · 1 comment
Dear Jörg and colleagues
I would like to download the training and validation data used for a specific language pair and setting (one that is already covered by a pre-trained Tatoeba model). An example of a model whose data I'd like to download:
https://github.com/Helsinki-NLP/Tatoeba-Challenge/tree/master/models/deu-eng
If possible, I'd like to avoid downloading the entire OPUS collection for all language pairs and settings.
Thanks for your help!
P.S. Will close this issue if I figure out how to do it
Found a way to do it:
Look here:
https://github.com/Helsinki-NLP/Tatoeba-Challenge/blob/master/Data.md
to find the link to the TAR file for your language pair, such as
https://object.pouta.csc.fi/Tatoeba-Challenge/deu-eng.tar
I am assuming that's the correct way :)
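For anyone who wants to script this, here is a minimal Python sketch of the download step. It assumes that every language pair follows the same URL pattern as the deu-eng link above, which I have not verified, so check Data.md for the actual link. The names `data_url` and `download_and_extract` are made up for illustration and are not part of the Tatoeba-Challenge repository.

```python
import shutil
import tarfile
import urllib.request
from pathlib import Path

# Assumption: all language pairs use the same URL pattern as the
# deu-eng example in this thread. Data.md lists the real links.
BASE_URL = "https://object.pouta.csc.fi/Tatoeba-Challenge"


def data_url(lang_pair: str) -> str:
    """Build the (assumed) TAR URL for a pair such as 'deu-eng'."""
    return f"{BASE_URL}/{lang_pair}.tar"


def download_and_extract(lang_pair: str, dest_dir: str = "data") -> Path:
    """Download the TAR file for one language pair and unpack it."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    tar_path = dest / f"{lang_pair}.tar"

    # Stream the archive to disk so it never has to fit in memory.
    with urllib.request.urlopen(data_url(lang_pair)) as response:
        with open(tar_path, "wb") as out:
            shutil.copyfileobj(response, out)

    with tarfile.open(tar_path) as archive:
        # Use the safer "data" extraction filter where Python offers it.
        if hasattr(tarfile, "data_filter"):
            archive.extractall(dest, filter="data")
        else:
            archive.extractall(dest)
    return dest
```

Calling `download_and_extract("deu-eng")` should fetch only that one archive into `data/` rather than the whole OPUS collection. The archives can be large, so make sure there is enough disk space first.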