Add Speech-to-Text backend for coqui-STT

Question

lw64 opened this issue 2 years ago · 4 comments

It seems to me that the coqui-STT project has reached a point where it can be used as a backend. Many languages are available, and the performance is also very good: "it is running in realtime on a raspberry pi 4 core".

It also has the capability of streaming speech recognition, but as far as I know, that is not yet supported or used anywhere else.

I don't know which is better: a server, as is used for the DeepSpeech backend, or direct use of coqui-STT's Python bindings.
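
For the direct-bindings option, here is a rough sketch of what streaming recognition could look like. It assumes the `stt` Python package, whose `Model` exposes `createStream()` and whose stream exposes `feedAudioContent()`, `intermediateDecode()` and `finishStream()`, much like DeepSpeech does. The helper names, the chunk size and the file paths are made up for illustration, and the audio is assumed to be 16-bit mono at the model's sample rate.

```python
def transcribe_streaming(model, samples, chunk_size=4096, on_partial=None):
    """Feed audio to a streaming session chunk by chunk and return the final text.

    `model` is anything with a createStream() method (e.g. stt.Model).
    `samples` is a sliceable sequence of 16-bit samples (a NumPy int16
    array when used with the real bindings).
    `on_partial` is an optional, hypothetical callback that receives the
    intermediate transcript after each chunk.
    """
    stream = model.createStream()
    for start in range(0, len(samples), chunk_size):
        stream.feedAudioContent(samples[start:start + chunk_size])
        if on_partial is not None:
            on_partial(stream.intermediateDecode())
    return stream.finishStream()


def transcribe_wav_file(model_path, wav_path):
    """Convenience wrapper; imports are local so the helper above has no hard dependencies."""
    import wave

    import numpy as np
    from stt import Model  # assumption: Coqui STT installed via `pip install stt`

    model = Model(model_path)
    with wave.open(wav_path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        frames = wf.readframes(wf.getnframes())
    samples = np.frombuffer(frames, dtype=np.int16)
    return transcribe_streaming(model, samples)
```

Something like `transcribe_wav_file("model.tflite", "utterance.wav")` would then return the transcript; both file names are placeholders. Whether this beats a separate server probably depends on whether loading the model inside the main process is acceptable on the target device.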

Answer 1 · 2022-01-07T00:13:47.000Z

There's a move to a plugin format for the voice services, and this should be one of the supported types soon.

Answer 2 · 2022-02-15T22:38:03.000Z

Coqui STT would be a straightforward drop-in replacement for DeepSpeech, because the APIs are nearly identical :D

Also, the latest English model from Coqui STT is much more accurate than the old DeepSpeech model.

Answer 3 · 2022-04-13T09:12:38.000Z

I'm running Coqui STT on my Picroft as described here (as a REST API, the same way DeepSpeech is currently integrated into Mycroft).
I needed it to work quickly, so it might not be the best solution, but maybe it is helpful anyway for someone planning to do it right.
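
For anyone wondering what the client side of such a REST setup might look like, here is a minimal sketch. It assumes the server accepts the raw WAV bytes as the POST body and answers with the transcript as plain text, which is how I understand the DeepSpeech server integration to work; the URL, the `Content-Type` header and the function names are illustrative assumptions, not taken from the setup linked above.

```python
import urllib.request


def build_stt_request(url, wav_bytes):
    """Build a POST request carrying the WAV data as the request body."""
    return urllib.request.Request(
        url,
        data=wav_bytes,
        headers={"Content-Type": "audio/wav"},  # assumption: server accepts this type
        method="POST",
    )


def transcribe_via_rest(url, wav_bytes, opener=urllib.request.urlopen, timeout=10.0):
    """Send the audio to the STT server and return the transcript text.

    `opener` is injectable so the function can be exercised without a network.
    """
    request = build_stt_request(url, wav_bytes)
    with opener(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

A call could look like `transcribe_via_rest("http://localhost:8080/stt", wav_bytes)`, where the host, port and path are placeholders for wherever the server is listening.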

Answer 4 · 2022-04-14T10:32:43.000Z

@hslr4 Maybe you could create a pull request for the integration into Mycroft?