neulab/awesome-align

Can I use awesome-align for a monosyllabic language?

quocthang0507 opened this issue · 2 comments

Can I use awesome-align for a monosyllabic language (e.g. Vietnamese)?
For example, instead of:
"sinh"==="student"
"viên"==="student"
I want:
"sinh viên"==="student"
Thanks for your project.

Hi, thanks! Right now, awesome-align splits a sentence into words on whitespace and only supports outputting word-level alignments.

I found that your code uses src.strip().split() and tgt.strip().split().
So I replaced those calls with a word segmenter that supports Vietnamese. Hope it works perfectly. 😂
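
For anyone landing here later: an alternative to patching the split calls is to pre-segment the Vietnamese side and join the syllables of each word with an underscore, so that the existing whitespace split sees "sinh_viên" as one token. The snippet below is only a rough sketch of that idea. `toy_segment`, `to_align_line`, and `LEXICON` are hypothetical names made up for illustration, and the greedy dictionary lookup is a stand-in for whatever real Vietnamese segmenter you use. The underscore joining is a workaround, not something awesome-align does itself, and I have not checked how it affects alignment quality.

```python
from typing import List, Set


def toy_segment(sentence: str, lexicon: Set[str], max_len: int = 3) -> List[str]:
    """Greedy longest-match over syllables.

    Hypothetical stand-in for a real Vietnamese word segmenter.
    """
    syllables = sentence.strip().split()
    words: List[str] = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first; a single syllable always matches.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = " ".join(syllables[i:i + n])
            if n == 1 or candidate.lower() in lexicon:
                words.append(candidate)
                i += n
                break
    return words


def to_align_line(src_words: List[str], tgt_sentence: str) -> str:
    """Build one 'src ||| tgt' line with multi-syllable words glued by '_'."""
    # Replacing inner spaces keeps each word intact under src.strip().split().
    src = " ".join(word.replace(" ", "_") for word in src_words)
    tgt = " ".join(tgt_sentence.strip().split())
    return f"{src} ||| {tgt}"


# Assumed toy lexicon; a real segmenter would not need this.
LEXICON = {"sinh viên"}

words = toy_segment("Tôi là sinh viên", LEXICON)
line = to_align_line(words, "I am a student")
print(line)  # Tôi là sinh_viên ||| I am a student
```

With this, "sinh_viên" comes out as a single source token in the word-level alignment, and you can turn the underscores back into spaces afterwards.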