facebookresearch/stopes

eng_twl_short_porn.txt missing (toxicity filtering NLLB)

gordicaleksa opened this issue · 0 comments

Hi folks!

I noticed that your toxicity filter uses eng_twl_short_porn.txt (it's hardcoded in ToxicityFilterConfig in stopes/stopes/pipelines/filtering/configs.py), but I can't find that file anywhere.

Nor is it mentioned in the NLLB paper.
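In case it helps anyone reproduce this, here is a minimal sketch of how one could check a local clone for the file and for references to it. It only uses the Python standard library. The clone directory name `stopes`, the helper name `find_file_and_mentions`, and the list of file extensions scanned are my own assumptions for illustration, not anything taken from the repo.

```python
from pathlib import Path

# Extensions scanned for textual mentions (an arbitrary, illustrative choice).
TEXT_SUFFIXES = {".py", ".yaml", ".yml", ".md", ".txt"}


def find_file_and_mentions(root, filename):
    """Return (files named `filename`, text files that mention `filename`) under `root`."""
    root = Path(root)
    if not root.is_dir():
        return [], []

    matches, mentions = [], []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        if path.name == filename:
            # The file itself exists somewhere in the tree.
            matches.append(path)
            continue
        if path.suffix in TEXT_SUFFIXES:
            try:
                text = path.read_text(encoding="utf-8", errors="ignore")
            except OSError:
                continue
            if filename in text:
                # Some other file refers to it by name.
                mentions.append(path)
    return sorted(matches), sorted(mentions)


if __name__ == "__main__":
    # "stopes" is a hypothetical path to a local clone of the repository.
    files, refs = find_file_and_mentions("stopes", "eng_twl_short_porn.txt")
    print("files found:", [str(p) for p in files])
    print("mentioned in:", [str(p) for p in refs])
```

If the first list comes back empty while the second one is not, the file is referenced in the code but not shipped with the repo.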

Am I missing something here?

Thanks in advance!