ssb22/lexconvert

Converting "born" from IPA to espeak is causing a mispronunciation.


If you convert "born" from IPA (bˈɔːɹn) to eSpeak, the result is [[b'O:r3:n]], which is pronounced "boren". The issue appears to be in the code that adds an implicit vowel before N or L: it drops the added vowel if the next letter is a consonant or a syllable separator, but not if it's the end of the word.

Adding this line of code after line 2859 fixes this specific problem, but I'm not sure if it opens up other cans of worms:
if ret[-2].endswith(maybe_bytes('*added',ret[-2])): del ret[-2]
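To make the reported behaviour easier to follow, here is a toy model of the rule as I understand it from the description above. It is not lexconvert's actual code: the function name toy_convert, the phoneme sets, the "." separator and the drop_at_word_end flag are all made up for illustration, and only the '*added' marker and the ret[-2] check mirror the line proposed above.

```python
# Toy model only: none of these names or data structures come from lexconvert.
CONSONANTS = {"b", "d", "t", "r", "n", "l"}  # tiny illustrative set
SEPARATORS = {"."}                           # stand-in for a syllable separator
ADDED_SUFFIX = "*added"                      # marker for an implicitly added vowel
IMPLICIT_VOWEL = "3:" + ADDED_SUFFIX


def toy_convert(phonemes, drop_at_word_end=False):
    """Return the phoneme list with implicit vowels added before N or L."""
    ret = []
    for p in phonemes:
        # An added vowel sits two slots back (vowel, then N/L). If a consonant
        # or separator now follows the N/L, the added vowel is dropped again.
        if (len(ret) >= 2 and ret[-2].endswith(ADDED_SUFFIX)
                and (p in CONSONANTS or p in SEPARATORS)):
            del ret[-2]
        # Add an implicit vowel before N or L when a consonant precedes it.
        if p in ("n", "l") and ret and ret[-1] in CONSONANTS:
            ret.append(IMPLICIT_VOWEL)
        ret.append(p)
    # The proposed extra check: also drop the added vowel at the end of the word.
    if drop_at_word_end and len(ret) >= 2 and ret[-2].endswith(ADDED_SUFFIX):
        del ret[-2]
    # Strip the marker before returning.
    return [p[:-len(ADDED_SUFFIX)] if p.endswith(ADDED_SUFFIX) else p for p in ret]


print(toy_convert(["b", "O:", "r", "n"]))                         # word-final N keeps the vowel
print(toy_convert(["b", "O:", "r", "n", "d"]))                    # following consonant drops it
print(toy_convert(["b", "O:", "r", "n"], drop_at_word_end=True))  # proposed check drops it
```

In this toy model the extra check also removes the vowel from a word-final consonant + L sequence such as ["b", "t", "l"], which is the sort of side effect I'm worried about.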

Thanks. In British English, “born” is bɔːn (I can't say bˈɔːɹn). eSpeak is fundamentally British: its original author Jonathan Duddington lived in Coventry, and I helped him a bit over email from Cambridge. eSpeak does have a US voice option that was added later, but I'm not sure if it will handle all American phoneme combinations the way you'd expect: we didn't have an American on the team, although we did have a bit of user feedback. After Jonathan died and the eSpeak NG fork took over, it's possible they improved its American pronunciation, but I haven't really been keeping track of developments in the NG version.

lexconvert's implicit_vowel_before_NL logic was originally added to help with conversions from the British Festival lexicon format, because Festival adds an implicit əː before any final N or L if the thing before it was a consonant (and R counts as a consonant), and many existing Festival lexicon entries assumed this behaviour. The very first version of lexconvert was simply a Festival to eSpeak converter; the other formats were added later. I did think of disabling the implicit_vowel_before_NL logic when the source format is not Festival, but I later found a couple of cases where the behaviour was required even when the source was not Festival's lexicon. Sadly I didn't make a note of the exact examples I found, because I never expected that logic to do any harm.

I'm now tempted to provide a "don't add implicit vowels" user option (command line switch or environment variable), plus an information message that gets printed if implicit vowels were added to your input, telling you how to turn off that behaviour. After all, the user probably knows better than I do whether their particular case needs them or not. Otherwise I'd be figuratively going mad trying to refine the exact set of circumstances when that rule should or should not come into play, depending on the source format ("don't do it if it's unicode-ipa") or previous phonemes ("don't do it if it's certain vowels + R"). We could go that way, but it will need a lot more checking, so a "hey, I just added an implicit vowel, did you want to turn that off" approach might be better in the short term.
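As a rough sketch of what that option and message might look like (assuming Python 3; nothing here exists in lexconvert yet, and the switch name --no-implicit-vowels, the environment variable LEXCONVERT_NO_IMPLICIT_VOWELS and both function names are placeholders I've made up for this comment):

```python
import argparse

# Hypothetical names, for illustration only; not existing lexconvert options.
FLAG = "--no-implicit-vowels"
ENV_VAR = "LEXCONVERT_NO_IMPLICIT_VOWELS"


def implicit_vowels_enabled(argv, environ):
    """Return False if the switch is present or the variable is set non-empty."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument(FLAG, action="store_true")
    # parse_known_args ignores every other argument the real program accepts.
    args, _unknown = parser.parse_known_args(argv)
    # Any non-empty value of the variable (even "0") turns the behaviour off.
    return not (args.no_implicit_vowels or environ.get(ENV_VAR))


def implicit_vowel_notice(count):
    """Build the information message; empty string if nothing was added."""
    if count == 0:
        return ""
    return ("Note: added %d implicit vowel(s) before N or L; "
            "use %s or set %s=1 to turn this off" % (count, FLAG, ENV_VAR))


print(implicit_vowels_enabled(["--no-implicit-vowels"], {}))
print(implicit_vowel_notice(1))
```

The real program would pass in sys.argv[1:] and os.environ, and would presumably print the notice to standard error so that it doesn't get mixed into the converted output.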