feature request: compare with more ASR engines
solyarisoftware opened this issue · 3 comments
Hi all, just to say thank you for the benchmark.
I propose adding more speech recognition engines to your tests. Below are some engines to add to your benchmark:
- WIT.ai
  official API doc: https://wit.ai/docs/http/20170307#post__speech_link
  https://www.liip.ch/en/blog/speech-recognition-with-wit-ai
- IBM Watson speech to text
  official API doc: https://www.ibm.com/watson/services/speech-to-text/
  https://www.pragnakalp.com/speech-recognition-speech-to-text-python-using-google-api-wit-ai-ibm-cmusphinx/
- Microsoft Cognitive Service Speech To text
  official API doc: https://azure.microsoft.com/en-us/services/cognitive-services/speech-to-text/
- Kaldi
  official API doc: https://kaldi-asr.org/doc/about.html
thanks again
giorgio
Thanks for the comment. (4) is hard, as Kaldi is basically a toolkit rather than a ready-to-go ASR engine, so its performance depends on how one trains it. Makes sense? (1-3) look like good candidates, though I expect their performance to be similar to Amazon's and Google's. If you get to integrate them into the framework and they don't cost much money to run (5 hours of the LibriSpeech dataset), I'm happy to merge a PR :)
Thanks Alireza for your feedback. My expectation/hypothesis is that the ASR engines mentioned give a worse WER than Google Speech. I'm working with Wit.ai now. I'll try to push a PR.
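For reference, WER (word error rate) is the word-level edit distance between the reference transcript and the engine output, divided by the number of reference words. Below is a minimal sketch of that calculation, not the benchmark's own code: the function name `word_error_rate` is hypothetical, and the normalization (lowercasing and splitting on whitespace) is an assumption that may differ from what the benchmark does.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

Comparing engines then comes down to computing this value over the same set of utterances for each engine; a lower value is better.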
BTW, please mark this open issue as a feature request/other or close it if you prefer.
Thanks
giorgio
Thanks a lot. I will close the issue for now. If you add an engine to this, please submit a PR and I'll be happy to review and merge it.