Unifies access to multiple open-source text-to-speech systems and voices for many languages, including:
- eSpeak
  - Supports huge number of languages/locales, but sounds robotic
- flite
  - English (19)
  - Hindi (1)
  - Bengali (1)
  - Gujarati (3)
  - Kannada (1)
  - Marathi (2)
  - Punjabi (1)
  - Tamil (1)
  - Telugu (3)
- Festival
  - English (9), Spanish (1), Catalan (1), Czech (4)
- nanoTTS
  - English (2), German (1), French (1), Italian (1), Spanish (1)
- MaryTTS
  - English (7), German (3), French (4), Italian (1), Russian (1), Swedish (1), Telugu (1), Turkish (1)
  - External server required (Docker image)
  - Add `--marytts-url` command-line argument
- Mozilla TTS
  - English (1)
  - External server required (Docker image, `amd64` only)
  - Add `--mozillatts-url` command-line argument
Basic OpenTTS server:
$ docker run -it -p 5500:5500 synesthesiam/opentts
Visit http://localhost:5500
For HTTP API test page, visit http://localhost:5500/api/
Exclude eSpeak (robotic voices):
$ docker run -it -p 5500:5500 synesthesiam/opentts --no-espeak
Run using docker compose with MaryTTS and Mozilla TTS:

    version: '2'
    services:
      opentts:
        image: synesthesiam/opentts
        ports:
          - 5500:5500
        command: --marytts-url http://marytts:59125 --mozillatts-url http://mozillatts:5002
        tty: true
      marytts:
        image: synesthesiam/marytts:5.2
        tty: true
      mozillatts:
        image: synesthesiam/mozilla-tts
        tty: true

Visit http://localhost:5500, choose language `en`, then pick a voice whose id starts with `marytts:` or `mozillatts:`.
NOTE: The Mozilla TTS Docker image only runs on `amd64` platforms (no Raspberry Pi).
See swagger.yaml
GET /api/tts
- `?voice` - voice in the form `tts:voice` (e.g., `espeak:en`)
- `?text` - text to speak
- Returns `audio/wav` bytes
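As a rough illustration of calling this endpoint from Python, the sketch below builds the request URL and returns the WAV bytes. It assumes a server reachable at http://localhost:5500 (the port published by the Docker command above). The helper names `build_tts_url`, `synthesize` and `save_wav` are illustrative and not part of OpenTTS.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed server address: the port published by the Docker command above.
BASE_URL = "http://localhost:5500"


def build_tts_url(voice: str, text: str, base_url: str = BASE_URL) -> str:
    """Return the GET /api/tts URL for a voice id of the form tts:voice."""
    query = urlencode({"voice": voice, "text": text})
    return f"{base_url}/api/tts?{query}"


def synthesize(voice: str, text: str, base_url: str = BASE_URL) -> bytes:
    """Request speech from a running server and return the audio/wav bytes."""
    with urlopen(build_tts_url(voice, text, base_url)) as response:
        return response.read()


def save_wav(path: str, voice: str, text: str, base_url: str = BASE_URL) -> None:
    """Write the synthesized audio to a file (illustrative helper)."""
    with open(path, "wb") as wav_file:
        wav_file.write(synthesize(voice, text, base_url))
```

With a server running, something like `save_wav("hello.wav", "espeak:en", "Hello world")` should write a playable WAV file. The file name here is only an example.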
GET /api/voices
- Returns JSON object
  - Keys are voice ids in the form `tts:voice`
  - Values are objects with:
    - `id` - voice identifier for TTS system (string)
    - `name` - friendly name of voice (string)
    - `gender` - M or F (string)
    - `language` - 2-character language code (e.g., "en")
    - `locale` - lower-case locale code (e.g., "en-gb")
    - `tts_name` - name of text to speech system
- Filter voices using query parameters:
  - `?tts_name` - only text to speech system(s)
  - `?language` - only language(s)
  - `?locale` - only locale(s)
  - `?gender` - only gender(s)
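A minimal Python sketch of querying the voice list and grouping the result by TTS system is shown below. It assumes a server at http://localhost:5500 and passes a single value per filter; how several values are combined in one parameter is not specified here, so the sketch does not attempt it. The function names are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:5500"  # assumed server address


def build_voices_url(base_url: str = BASE_URL, **filters: str) -> str:
    """Return the GET /api/voices URL, adding only the filters that were given."""
    query = urlencode({key: value for key, value in filters.items() if value})
    return f"{base_url}/api/voices" + (f"?{query}" if query else "")


def fetch_voices(base_url: str = BASE_URL, **filters: str) -> dict:
    """Fetch the voices object (voice id -> details) from a running server."""
    with urlopen(build_voices_url(base_url, **filters)) as response:
        return json.load(response)


def voices_by_tts(voices: dict) -> dict:
    """Group voice ids by the tts_name field of each voice."""
    grouped = {}
    for voice_id, details in voices.items():
        grouped.setdefault(details["tts_name"], []).append(voice_id)
    return grouped
```

For example, `voices_by_tts(fetch_voices(language="en"))` would map each TTS system to its English voice ids, given a running server.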
GET /api/languages
- Returns JSON list of supported languages
- Filter languages using query parameters:
  - `?tts_name` - only text to speech system(s)
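One possible use of this endpoint is checking that a language is available before requesting speech. The sketch below assumes a server at http://localhost:5500 and that the returned JSON list contains plain language-code strings such as "en"; both are assumptions, and the function names are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:5500"  # assumed server address


def build_languages_url(tts_name: str = "", base_url: str = BASE_URL) -> str:
    """Return the GET /api/languages URL, optionally filtered by TTS system."""
    url = f"{base_url}/api/languages"
    if tts_name:
        url += "?" + urlencode({"tts_name": tts_name})
    return url


def fetch_languages(tts_name: str = "", base_url: str = BASE_URL) -> list:
    """Fetch the list of supported languages from a running server."""
    with urlopen(build_languages_url(tts_name, base_url)) as response:
        return json.load(response)


def supports_language(languages: list, code: str) -> bool:
    """Case-insensitive check; assumes the list holds language-code strings."""
    return code.lower() in (language.lower() for language in languages)
```

With a server running, `supports_language(fetch_languages("flite"), "en")` would report whether flite offers any English voice.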
See samples directory. eSpeak samples are not included since there are a lot of languages (and they all sound robotic).