Grammar Bug: Sometimes words are only partially recognized
LazoCoder opened this issue · 0 comments
Problem
If I have a word like `escape` in my grammar, sometimes whisper will output the first few letters `esc` instead of the whole word. The expected behavior is that only the entire word should be recognized.
How to Reproduce (example 1)
Go into `examples/command` and make a simple single-line grammar: `root ::= " escape"`. Now if you say "escape", it will sometimes print `esc` instead of the whole word `escape`. You can also try saying "essk", which will also print `esc`, but the expected behavior is to print nothing, since this is an invalid command.
How to Reproduce (example 2)
Another example is to set the grammar to `root ::= " caps"`. If you say "cap", it will print `cap` (without the `s`). The expected behavior is to print nothing, because `cap` is an invalid command; only `caps` (with the `s`) should be accepted.
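To make the expected behavior in both examples concrete, here is a minimal Python sketch. It is not whisper.cpp code: `classify` and `expected_output` are hypothetical helpers, and the grammar is modeled as a single string literal, which is an assumption that only fits grammars as simple as the two above.

```python
# Hypothetical helpers, not part of whisper.cpp. They model the *expected*
# acceptance rule for a grammar whose root is a single literal,
# e.g. root ::= " escape".

def classify(transcript: str, literal: str) -> str:
    """Classify a transcript against a single-literal grammar.

    "accept"  -> the transcript is exactly the literal
    "partial" -> the transcript is a proper prefix of the literal
                 (the case reported in this issue)
    "reject"  -> anything else, including an empty transcript
    """
    if transcript == literal:
        return "accept"
    if transcript and literal.startswith(transcript):
        return "partial"
    return "reject"


def expected_output(transcript: str, literal: str) -> str:
    """All-or-nothing output: the whole literal, or an empty string."""
    return transcript if classify(transcript, literal) == "accept" else ""


print(repr(expected_output(" escape", " escape")))
print(repr(expected_output(" esc", " escape")))
print(repr(expected_output(" cap", " caps")))
```

In this model `esc` and `cap` are classified as "partial", so the expected output for them is empty, whereas the behavior I'm seeing prints them.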
My Setup
I'm running `examples/command` with my custom grammar on a Windows 10 machine via GPU/CUDA, and I get the same problem whether I use `ggml-small` or `ggml-large-v2`.
Temporary Workaround Issue
I can remove invalid words in post-processing, but the problem is that these erroneous words prematurely cut off recognition of any other commands that should come after. For example, if I have a long list of commands like "please escape and log out" and `escape` is incorrectly output as `esc`, then everything that comes after that command is omitted from the output.
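A rough Python sketch of that post-processing step shows why it does not help. The word list and the `filter_commands` helper are hypothetical, and the truncated transcript is a stand-in for the kind of output described above, not a captured log.

```python
# Hypothetical command vocabulary, for illustration only.
VALID_WORDS = frozenset({"please", "escape", "and", "log", "out"})


def filter_commands(transcript: str, valid_words=VALID_WORDS) -> list[str]:
    """Keep only whitespace-separated words that are valid commands."""
    return [w for w in transcript.lower().split() if w in valid_words]


# If the whole phrase is recognized, filtering changes nothing.
full = filter_commands("please escape and log out")

# Stand-in for a transcript cut short at the partial word "esc".
# The filter can drop "esc", but it cannot bring back "and log out".
truncated = filter_commands("please esc")

print(full)
print(truncated)
```

The filter only removes the bad word. The commands that were never output are still missing, which is why this needs to be fixed during recognition rather than afterwards.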
Notes
I noticed that user @ulatekh also experienced this problem: #2127 (comment), #2047 (comment). I created this issue in response to this comment: #2127 (comment).