Add Speech-to-Text backend for coqui-STT
lw64 opened this issue · 4 comments
It seems to me that the coqui-STT project has reached a point where it can be used as a backend. Models are available for many languages, and the performance is also very good: "it is running in realtime on a raspberry pi 4 core".
It is also capable of streaming speech recognition, but as far as I know, that is not yet supported or used anywhere else.
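To make the streaming idea concrete, here is a minimal sketch of how a backend could consume audio incrementally. It assumes a model object exposing `createStream()`, and a stream exposing `feedAudioContent()`, `intermediateDecode()` and `finishStream()`, which is how I understand the Coqui STT Python bindings to work; the function name `stream_transcribe` and the chunking are hypothetical and not part of any existing backend.

```python
def stream_transcribe(model, chunks):
    """Yield a partial transcript after each audio chunk, then the final one.

    Assumption: `model` follows the Coqui STT / DeepSpeech streaming
    interface (createStream, feedAudioContent, intermediateDecode,
    finishStream). With the real bindings each chunk would be a buffer
    of 16-bit PCM samples at the model's sample rate.
    """
    stream = model.createStream()
    for chunk in chunks:
        # Hand the next piece of audio to the decoder.
        stream.feedAudioContent(chunk)
        # Ask for the best guess so far without closing the stream.
        yield stream.intermediateDecode()
    # Closing the stream returns the final transcript.
    yield stream.finishStream()
```

A caller would iterate over the generator and display each partial result as it arrives; only the last yielded value is the final transcript.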
I don't know whether a server, as used for the deepspeech backend, or direct use of coqui-STT's Python bindings is better.
There's a move to a plugin format for the voice services, and this should be one of the supported types soon.
Coqui STT would be a straightforward drop-in replacement for DeepSpeech, because the APIs are nearly identical :D
Also, the latest English model from Coqui STT is much more accurate than the old DeepSpeech model.
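If the two APIs really are that close, a backend could in principle pick whichever package is installed and use it the same way. A rough sketch, assuming the Python packages are importable as `stt` and `deepspeech` and that both expose a `Model` class; `find_backend` is a hypothetical helper, not something that exists in either project:

```python
import importlib


def find_backend(candidates=("stt", "deepspeech"), attr="Model"):
    """Return (module_name, class) for the first importable candidate.

    Assumption: each candidate module exposes a class named `attr`
    with a compatible interface. Candidates are tried in order, so
    Coqui STT is preferred over DeepSpeech by default.
    """
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # Package not installed, try the next one.
        backend = getattr(module, attr, None)
        if backend is not None:
            return name, backend
    raise ImportError(
        "none of these speech-to-text packages could be used: "
        + ", ".join(candidates)
    )
```

Usage would be something like `name, Model = find_backend()` followed by `Model(model_path)`, where `model_path` points to a model file matching the package that was found. Model files are not interchangeable between the two projects as far as I know, so the path would still need to be configured per backend.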