NLLB mined data?
gordicaleksa opened this issue · 1 comment
gordicaleksa commented
Hi!
Did you ever release the mined data behind the NLLB project? (Section 5.4 of the paper mentions roughly 1.1B sentence pairs.)
Thank you!
gordicaleksa commented
Apologies, I missed the metadata README that you already released.
I also found that AllenAI replicated the dataset and released it on Hugging Face: https://huggingface.co/datasets/allenai/nllb