This repository contains an attempt to integrate the Rasa chatbot directly with state-of-the-art ASR (Automatic Speech Recognition) and TTS (Text-to-Speech) models, without running additional servers or socket connections.
In this project, the browser plays a big part because it provides access to connected media input devices such as microphones. I therefore needed an interface that is supported by all mainstream browsers, including older versions. That's why I used the `AudioContext()` interface. I didn't use other interfaces such as `MediaRecorder` because it isn't compatible with Microsoft Edge or Safari. I also didn't use plugins such as `recorderJs`, as it is no longer supported.
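As a rough illustration of the browser-compatibility concern, here is a minimal sketch of how the `AudioContext` constructor can be looked up in a way that also covers older WebKit-based browsers, which expose it under the prefixed name `webkitAudioContext`. The helper name `resolveAudioContext` is hypothetical and is not taken from this repository's code.

```javascript
// Hypothetical helper (not part of this repository): return whichever
// AudioContext constructor the given global scope exposes, or null.
function resolveAudioContext(globalScope) {
  // Older WebKit-based browsers only provide the prefixed constructor.
  return globalScope.AudioContext || globalScope.webkitAudioContext || null;
}

// Usage in a browser; the guard keeps the snippet harmless elsewhere.
if (typeof window !== 'undefined') {
  const AudioContextCtor = resolveAudioContext(window);
  if (AudioContextCtor) {
    const audioContext = new AudioContextCtor();
    console.log('Sample rate:', audioContext.sampleRate);
  } else {
    console.log('AudioContext is not available in this browser.');
  }
}
```

Passing the global scope in as a parameter is only a convenience so that the lookup can be exercised outside a browser.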
Here is a table of the minimum supported version of each mainstream browser:

| Platform | Browser | Support | Minimum supported version |
|---|---|---|---|
| Desktop | Chrome | ✔️ | 35 |
| Desktop | Edge | ✔️ | Full support |
| Desktop | Firefox | ✔️ | 25 |
| Desktop | Internet Explorer | ❌ | No support |
| Desktop | Opera | ✔️ | 22 |
| Desktop | Safari | ✔️ | 6 |
| Mobile | Android webview | ✔️ | Full support |
| Mobile | Chrome for Android | ✔️ | 35 |
| Mobile | Firefox for Android | ✔️ | 26 |
| Mobile | Opera for Android | ✔️ | 22 |
| Mobile | Safari on iOS | ✔️ | Full support |
| Mobile | Samsung Internet | ✔️ | Full support |
- The `AudioContext` interface doesn't work on Internet Explorer, and its status on Edge still needs to be checked.
- The TTS interface has a problem whenever Rasa responds with more than one `text` message. It handles multiple responses perfectly as long as they contain only one `text` message. If Rasa responds with more than one `text` message, all the `text` messages are played at the same time.
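One possible way to address the overlapping playback described above is to queue the synthesized clips and start each one only after the previous one has finished. The following is a minimal sketch under that assumption. It is not the fix used in this repository, and `createPlaybackQueue` and `startPlayback` are hypothetical names.

```javascript
// Hypothetical sketch: play queued clips strictly one after another.
// `startPlayback(clip, done)` must begin playing `clip` and call `done`
// once that clip has finished.
function createPlaybackQueue(startPlayback) {
  const pending = [];
  let playing = false;

  function playNext() {
    if (pending.length === 0) {
      playing = false; // nothing left, so the next enqueue starts immediately
      return;
    }
    playing = true;
    const clip = pending.shift();
    startPlayback(clip, playNext);
  }

  return {
    // Add a clip; it starts right away only if nothing is playing.
    enqueue(clip) {
      pending.push(clip);
      if (!playing) {
        playNext();
      }
    },
  };
}
```

In the browser, `startPlayback` could create an `AudioBufferSourceNode` for the clip, assign `done` to its `onended` handler, and then call `start()`, so that each `text` response is heard in order.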
Special Thanks to:
- Sean Naren for training the provided ASR model.
- ESPNet organization for training all provided TTS models.
- SamimOnline for providing the early Bootstrap template.
- Patrick Roberts for the synth-js JavaScript plugin.
- Remon Kamal for the technical help during this project; his guidance was of great help!