em dash character crashes French pipeline
pa-nlp opened this issue · 0 comments
pa-nlp commented
I tested trankit with the base and large models using the French pipeline, and the em dash (Unicode code point 8212, U+2014) causes the model to crash. The online demo seems to have the same problem. Replacing the em dash with a hyphen in the input string before processing avoids the issue. I did not test the three other types of dashes, nor other languages.
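For reference, a minimal sketch of the workaround in Python. The helper name `replace_em_dash` and the sample sentence are made up for illustration and are not part of trankit. It only covers the em dash, since the other dash characters were not tested.

```python
# Workaround sketch: swap the em dash for a plain hyphen before the text
# reaches the pipeline. `replace_em_dash` is a hypothetical helper, not a
# trankit function.

EM_DASH = "\u2014"  # Unicode code point 8212


def replace_em_dash(text: str, replacement: str = "-") -> str:
    """Return `text` with every em dash replaced by `replacement`."""
    return text.replace(EM_DASH, replacement)


# Illustrative input containing two em dashes.
sample = "Il arriva \u2014 enfin \u2014 à Paris."
cleaned = replace_em_dash(sample)
print(cleaned)
```

The cleaned string can then be passed to the French pipeline in place of the raw input. Note that this changes the text the model sees, so token offsets still line up (one character is replaced by one character), but the original punctuation is lost in the output.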