Picovoice/speech-to-text-benchmark

feature request: compare with more ASR engines

solyarisoftware opened this issue · 3 comments

Hi all, first of all thank you for the benchmark.
I propose adding more speech recognition engines to your tests. Here are some engines to add to your benchmark:

  1. WIT.ai
    official API doc: https://wit.ai/docs/http/20170307#post__speech_link
    https://www.liip.ch/en/blog/speech-recognition-with-wit-ai

  2. IBM Watson Speech to Text
    official API doc: https://www.ibm.com/watson/services/speech-to-text/
    https://www.pragnakalp.com/speech-recognition-speech-to-text-python-using-google-api-wit-ai-ibm-cmusphinx/

  3. Microsoft Cognitive Services Speech to Text
    official API doc: https://azure.microsoft.com/en-us/services/cognitive-services/speech-to-text/

  4. Kaldi
    official API doc: https://kaldi-asr.org/doc/about.html

thanks again
giorgio

Thanks for the comment. (4) is hard, as Kaldi is basically a toolkit rather than a ready-to-go ASR engine, so its performance depends on how one trains it. Makes sense? (1-3) look like good candidates, though I expect their performance to be similar to Amazon and Google. If you get to integrate them into the framework and they don't cost much money to run (5 hours of the LibriSpeech dataset), happy to merge a PR :)

Thanks Alireza for your feedback. My expectation/hypothesis is that the ASRs I mentioned give a worse WER than Google Speech. I'm working with Wit.ai now. I'll try to push a PR.
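
For reference, WER (word error rate) is the word-level edit distance between the reference transcript and the engine output, divided by the number of reference words. Below is a minimal sketch of that calculation, assuming simple lowercasing and whitespace tokenization; the function name `word_error_rate` is hypothetical and this is not necessarily how the benchmark itself normalizes text or computes the score.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumption: lowercasing + whitespace split is enough normalization here.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Comparing engines this way only makes sense if every engine's output goes through the same normalization before scoring.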

BTW, please mark this open issue as a feature request/other or close it if you prefer.

Thanks
giorgio

Thanks a lot. I will close the issue for now. If you add an engine to the benchmark, please submit a PR and I'll be happy to review/merge it.