Phonemes missing in German phonemizer in Espeak-NG Backend
Pranjalya opened this issue · 1 comments
Describe the bug
When phonemizing German words with the Espeak-NG backend, some words come out with double question marks (??) in place of phonemes. On further testing, I found that this occurs specifically with words containing the substring 'ur'.
Phonemizer version
phonemizer-3.2.1
available backends: espeak-ng-1.50, segments-2.2.1
System
Ubuntu Docker image, Python 3.9. The behavior is the same with Python 3.8.
The issue also persists on Ubuntu 22.04, both with standard Python and in an Anaconda environment.
To reproduce
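from phonemizer.backend import EspeakBackend

phonemizer_backend = EspeakBackend(
    language='de',
    punctuation_marks=';:,.!?¡¿—…"«»“”~/。【】、‥،؟“”؛',
    preserve_punctuation=True,
    language_switch='remove-flags',
    with_stress=True)
# With espeak-ng 1.50, the printed phonemes contain ?? where the 'ur' should be.
print(phonemizer_backend.phonemize(["frankfurt"], strip=True))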
Expected behavior
No phonemes should be missing: the proper phonemes should appear instead of ??.
Hi, this is an espeak-related problem, not a phonemizer one.
By the way, the problem is not present with espeak-1.48, so you can try that version instead:
$ espeak-ng --version
eSpeak NG text-to-speech: 1.50 Data at: /usr/lib/x86_64-linux-gnu/espeak-ng-data
$ espeak-ng --ipa -v de frankfurt
frˈaŋkf??t
$ espeak --version
eSpeak text-to-speech: 1.48.15 16.Apr.15 Data at: /usr/lib/x86_64-linux-gnu/espeak-data
$ espeak --ipa -v de frankfurt
frˈaŋkfʊɐt
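To see which words in a larger list are affected, a small check for the ?? placeholder may help. This is only a minimal sketch: `find_missing_phonemes` is a hypothetical helper, not part of phonemizer or espeak, and it assumes that ?? in the output always marks a missing phoneme unless the input word itself contains ??.

```python
from typing import List, Tuple

# What espeak-ng 1.50 printed in place of the missing phonemes above.
PLACEHOLDER = "??"


def find_missing_phonemes(words: List[str], phonemized: List[str]) -> List[Tuple[str, str]]:
    """Return (word, output) pairs whose output contains the placeholder.

    Hypothetical helper, not part of phonemizer. Words that themselves
    contain "??" are skipped, because preserve_punctuation=True would
    legitimately keep those question marks in the output.
    """
    if len(words) != len(phonemized):
        raise ValueError("words and phonemized must have the same length")
    return [
        (word, output)
        for word, output in zip(words, phonemized)
        if PLACEHOLDER in output and PLACEHOLDER not in word
    ]


# Outputs copied from the two espeak runs above.
print(find_missing_phonemes(["frankfurt"], ["frˈaŋkf??t"]))  # espeak-ng 1.50 output
print(find_missing_phonemes(["frankfurt"], ["frˈaŋkfʊɐt"]))  # espeak 1.48 output
```

With phonemizer, the second argument would be the list returned by `phonemizer_backend.phonemize(words, strip=True)` for the same `words` list.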