Recognized words differ from recognized phonemes.

Question

Recognized words differ from recognized phonemes.

timkarlo opened this issue 4 years ago · 3 comments

Recognizing the words gives this output:
pocketsphinx\bin\Release\x64>pocketsphinx_continuous.exe -infile "C:\1timkarlo\ai_short16000.wav" -hmm model\en-us\en-us -lm model\en-us\en-us.lm.bin -dict model\en-us\cmudict-en-us.dict -logfn nul

the question of how real is artificial intelligence is

Recognizing the phones gives this output:
pocketsphinx_continuous.exe -infile "C:\1timkarlo\ai_short16000.wav" -hmm model\en-us\en-us -allphone yes -backtrace yes -beam 1e-20 -pbeam 1e-20 -lw 2.0 -logfn nul
SIL B IH V K AO R SH IH N EH V EH HH AW R R TH R IY OW L HH SIL EY IH Z SIL

The entries for these words in cmudict-en-us.dict are:
the DH AH
question K W EH S CH AH N
of AH V
how HH AW
real R IY L
is IH Z

Where does this difference come from?

Answer 1 · 2022-09-29T11:35:28.000Z

This may have been due to the endianness bug in language models, which affected the phone LM.

Answer 2 · 2022-09-29T12:43:12.000Z

Possibly, but allphone search gives different (and worse) results, because the language model is less powerful. This is, unfortunately, just the way it works.
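
To put a rough number on how far the allphone output is from the dictionary pronunciations, one option is a plain edit distance over phone sequences. The sketch below is only an illustration: the function names are hypothetical, the pronunciations are the ones quoted in the question, and the assumption that the first nine non-silence phones of the allphone output correspond to "the question" is a guess, since the decoder output carries no word boundaries.

```python
from typing import Dict, List, Sequence


def phone_edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phone sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # phone missing from hyp
                            curr[j - 1] + 1,      # extra phone in hyp
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


# Dictionary entries exactly as quoted in the question.
PRONUNCIATIONS: Dict[str, str] = {
    "the": "DH AH",
    "question": "K W EH S CH AH N",
    "of": "AH V",
    "how": "HH AW",
    "real": "R IY L",
    "is": "IH Z",
}


def expected_phones(words: Sequence[str]) -> List[str]:
    """Concatenate the dictionary phones for a sequence of words."""
    phones: List[str] = []
    for word in words:
        phones.extend(PRONUNCIATIONS[word].split())
    return phones


reference = expected_phones(["the", "question"])
# Assumption: these nine phones (taken right after the leading SIL in the
# allphone output) cover the same stretch of audio as "the question".
hypothesis = "B IH V K AO R SH IH N".split()

distance = phone_edit_distance(reference, hypothesis)
print(f"{distance} edits over {len(reference)} reference phones")
```

Dividing the distance by the number of reference phones gives a phone error rate for that stretch, which is a convenient way to compare decoder settings on the same file.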

Answer 3 · 2022-09-29T12:44:09.000Z

Especially in this case, where you don't actually have a language model over the phonemes. Many things sound the same, especially to a not very good acoustic model!

You can try wav2vec if you want to get state-of-the-art phoneme recognition. You might need a few more gigabytes though :-)