This is a speech-to-text application for WhatsApp. It uses whatsapp-web.js, runs on Docker, and supports several speech recognition systems:
- Whisper
  - OpenWhisper (locally)
  - FasterWhisper (locally)
  - OpenAI Whisper (online API)
- Google Speech-to-Text
Once authenticated on WhatsApp Web, the worker transcribes voice messages either automatically or when you reply to a voice message with the command !tran. It is currently configured to transcribe only messages from contacts saved in your contact book.
If you want to contribute, just send a pull request.
Just reply to the voice message you want to transcribe with !tran.
You can also turn automatic transcription on or off, globally or per chat, via chat commands. Send !help in the chat to get an overview of the available commands. Only you can use the bot commands.
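The rules described above can be sketched as a small decision function. This is a minimal illustration and not the project's actual code: the names `parseCommand` and `shouldTranscribe`, the shape of the `state`, `voiceMsg`, and `reply` objects, and the choice that a per-chat setting overrides the global one are all assumptions. Only the `!tran` trigger mentioned in this README is modeled.

```javascript
// Hypothetical sketch of the command and transcription rules.
// None of these names come from the project; they are for illustration only.

// Extract a bot command such as "!tran" or "!help" from a message body.
// Returns null when the message is not a command.
function parseCommand(body) {
  const text = (body || "").trim().toLowerCase();
  if (!text.startsWith("!")) return null;
  return text.split(/\s+/)[0];
}

// Decide whether a voice message should be transcribed.
// state:    { autoGlobal: boolean, autoPerChat: { [chatId]: boolean } }  (assumed shape)
// voiceMsg: { chatId, isVoice, senderIsContact }                         (assumed shape)
// reply:    optional { body, fromMe } message that quotes the voice message
function shouldTranscribe(state, voiceMsg, reply = null) {
  if (!voiceMsg.isVoice) return false;
  // Assumption: the contacts-only restriction applies to every trigger.
  if (!voiceMsg.senderIsContact) return false;
  // Manual trigger: only the account owner may use bot commands.
  if (reply && reply.fromMe && parseCommand(reply.body) === "!tran") return true;
  // Automatic mode. Assumption: a per-chat setting overrides the global one.
  const perChat = state.autoPerChat[voiceMsg.chatId];
  return perChat !== undefined ? perChat : state.autoGlobal;
}
```

With `state = { autoGlobal: true, autoPerChat: { "chat-b": false } }`, a voice message from a contact in `chat-b` would be skipped automatically but still transcribed when you reply to it with !tran.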
- To build the images, choose your compose file and run
docker compose build
- To start the containers, run
docker compose up -d
- Display the logs for the node container with
docker logs -f --tail 100 speech_to_text_node
to see the QR code, which is required to authenticate against the WhatsApp Web client.
Check the detailed documentation of the different Speech-to-text implementations:
- Persist the bot's configuration state that is set through chat commands
  - Currently, if transcription is disabled for a single chat, this setting is lost when the service restarts.
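
One possible way to approach this item is to store the settings in a JSON file. The sketch below is an assumption about how that could look and not an existing part of the project: `defaultState`, `loadState`, `saveState`, the state shape, and the default of automatic transcription being enabled are all hypothetical.

```javascript
// Hypothetical sketch: persist bot settings to a JSON file so they survive restarts.
const fs = require("fs");

// Assumed default: automatic transcription on, no per-chat overrides.
function defaultState() {
  return { autoGlobal: true, autoPerChat: {} };
}

// Load the saved state, falling back to defaults when no file exists yet.
function loadState(file) {
  try {
    const saved = JSON.parse(fs.readFileSync(file, "utf8"));
    return { ...defaultState(), ...saved };
  } catch (err) {
    if (err.code === "ENOENT") return defaultState();
    throw err; // Corrupt or unreadable files are reported, not silently reset.
  }
}

// Write to a temporary file first and then rename it,
// so an interrupted write does not leave a half-written state file.
function saveState(file, state) {
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(state, null, 2));
  fs.renameSync(tmp, file);
}
```

The bot would call `saveState` after every command that changes a setting and `loadState` once at startup. In a Docker setup, the file would need to live on a mounted volume to survive container recreation.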