This is a voice assistant prototype.
Frontend: HTML + CSS + JS
Backend: Flask + Redis
Supported features:
- Voice or text input. Speech-to-text conversion is done with the Google Cloud Speech API
- Translation from Russian to English using the Google Translate API
- Translation history (can be removed)
- Voice output. Speech synthesis is implemented with gTTS
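The translation history listed above could be kept in Redis roughly as follows. This is only a sketch, not the project's actual code: the class name `TranslationHistory`, the key `translation_history`, and the `ru`/`en` field names are all assumptions made for illustration. The client is passed in, so any object with redis-py's `lpush`, `lrange`, and `delete` methods will work.

```python
import json


class TranslationHistory:
    """Keeps translation pairs in a Redis list, newest first (hypothetical helper)."""

    def __init__(self, client, key="translation_history"):
        # `client` is any object exposing redis-py's lpush/lrange/delete,
        # for example redis.Redis(...). The key name is an assumption.
        self.client = client
        self.key = key

    def add(self, source_text, translated_text):
        # Store each pair as one JSON string at the head of the list.
        entry = json.dumps({"ru": source_text, "en": translated_text})
        self.client.lpush(self.key, entry)

    def all(self):
        # lrange(key, 0, -1) returns every element of the list.
        return [json.loads(raw) for raw in self.client.lrange(self.key, 0, -1)]

    def clear(self):
        # Deleting the key removes the whole history.
        self.client.delete(self.key)
```

With redis-py installed, this could be used as `TranslationHistory(redis.Redis(host="redis", port=6379))`, where the host name `redis` is an assumed docker-compose service name.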
A key is needed to query the Google Speech API. Possible options:
- Set up authentication yourself (see the tutorial for details). It's free.
- Contact me and I'll share my key (unfortunately I cannot share the key publicly)
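Whichever option is used, it can help to check at startup that the key is actually visible to the backend. The sketch below assumes the key is a service-account JSON file referenced by the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, which is how Google Cloud client libraries commonly locate credentials. Whether this project loads its key that way is not stated here, and the function name `find_credentials` is hypothetical.

```python
import os
from pathlib import Path


def find_credentials(env=None):
    """Return the key file path named by GOOGLE_APPLICATION_CREDENTIALS.

    Raises RuntimeError if the variable is unset or the file is missing.
    """
    env = os.environ if env is None else env
    value = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = Path(value)
    if not path.is_file():
        raise RuntimeError(f"Key file not found: {path}")
    return path
```

Calling `find_credentials()` before the first Speech API request turns a missing key into an immediate, readable error instead of a failure in the middle of a query.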
To build the Docker image and start the containers:
docker build -t voice-assistant .
docker-compose up
- For security reasons, audio recording with navigator.mediaDevices.getUserMedia cannot be performed over insecure connections (including localhost). To resolve this, generate an SSL certificate for localhost. Then, after running docker-compose up, open
localhost:5000
I have not found any other way to record or play audio on localhost (if you open 0.0.0.0:5000, audio will be neither recorded nor played).
- Sometimes the Google Translate API fails to recognize queries for unknown reasons, and you'll get a 'Not recognized' output. Important: this is not an application error; it is an error response from Google's API.
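The SSL setup for localhost mentioned in the notes above might look like the following on the Flask side. This is a minimal sketch under stated assumptions: the file names `cert.pem` and `key.pem` and the helper names `make_ssl_context` and `serve_https` are hypothetical, and the project may wire this up differently. It relies on the `ssl_context` option that Flask's `app.run` passes through to Werkzeug's development server.

```python
import ssl
from pathlib import Path


def make_ssl_context(cert_file="cert.pem", key_file="key.pem"):
    """Build a server-side TLS context from a certificate/key pair."""
    # Fail early with a clear message if either file is missing.
    for name in (cert_file, key_file):
        if not Path(name).is_file():
            raise FileNotFoundError(f"Missing TLS file: {name}")
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


def serve_https(app, cert_file="cert.pem", key_file="key.pem"):
    """Run a Flask app over HTTPS on port 5000."""
    # Binding to 0.0.0.0 makes the server reachable through the Docker
    # port mapping; the page itself is still opened as localhost:5000.
    app.run(host="0.0.0.0", port=5000,
            ssl_context=make_ssl_context(cert_file, key_file))
```

A self-signed pair for local use can be generated with, for example, `openssl req -x509 -newkey rsa:4096 -nodes -keyout key.pem -out cert.pem -days 365 -subj "/CN=localhost"`. The browser will show a warning for such a certificate until it is accepted or trusted locally.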