google/cld3

Increase the number of supported languages

Hi!
Do you have any plans to increase the number of supported languages to 200-300?
Languages such as Chuvash (chv), Mari (mhr), Hill Mari (mrj), and Komi (kpv), which have a presence on the web, are not included here, and hence are not in the multilingual C4 dataset.