Kozea/Pyphen

Why can't jonathan be hyphenated?

pengyu opened this issue · 1 comment

See the following example. jonathan is not hyphenated. Does anybody know what is wrong?

$ ./main1.py 
jonathan
$ cat ./main1.py 
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:

import pyphen
dic = pyphen.Pyphen(lang='en')
print(dic.inserted('jonathan'))

liZe commented

Thanks for reporting!

In languages such as English, syllabification is based on etymological principles. That's why there's no way to "guess" how a word can be split if it's not in the dictionary, and unfortunately it looks like the English dictionary provided with LibreOffice doesn't have first names and other proper nouns in it.

For other languages like French, syllabification is based on phonetic principles. You'll then get syllabification for proper nouns and even for words that don't exist:

>>> import pyphen
>>> dic = pyphen.Pyphen(lang='fr')
>>> print(dic.inserted('jonathan'))
jo-na-than
>>> print(dic.inserted('supercalifragilisticexpialidocious'))
su-per-ca-li-fra-gi-lis-ti-cex-pia-li-do-cious

The only solution is to ask for proper nouns to be added to the English dictionary. You can file a ticket in the Document Foundation Bugzilla and hope to find a very motivated contributor 😉.
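
Until the dictionary itself covers such names, a caller could keep a small local table of hand-hyphenated proper nouns and consult it before Pyphen. The following is only a sketch of that idea: `NAME_OVERRIDES`, `hyphenate` and `_fallback` are hypothetical names that are not part of Pyphen, and the break points for "jonathan" were chosen by hand rather than taken from any dictionary.

```python
# Hypothetical stopgap, not part of Pyphen: look words up in a local,
# hand-maintained table first and only then fall back to Pyphen.
try:
    import pyphen

    _dic = pyphen.Pyphen(lang='en')

    def _fallback(word):
        return _dic.inserted(word)
except ImportError:
    # Without Pyphen installed, unknown words are returned unchanged.
    def _fallback(word):
        return word

# Keys are lowercase; each value must equal its key once hyphens are removed.
# The break points are an assumption made for this example.
NAME_OVERRIDES = {
    'jonathan': 'jon-a-than',
}


def hyphenate(word, overrides=NAME_OVERRIDES, fallback=_fallback):
    """Return word with hyphens inserted, preferring the local table."""
    manual = overrides.get(word.lower())
    if manual is None:
        return fallback(word)
    # Copy the caller's letters (and so their capitalisation) into the
    # hand-written pattern, keeping the hyphens where the table puts them.
    letters = iter(word)
    return ''.join('-' if ch == '-' else next(letters) for ch in manual)


print(hyphenate('jonathan'))
print(hyphenate('Jonathan'))
```

Words missing from the table go through `fallback`, so everything else is still hyphenated by the regular dictionary.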