Open Text to Speech Server

Unifies access to multiple open source text to speech systems and voices for many languages, including:

eSpeak
- Supports a huge number of languages/locales, but sounds robotic
flite
- English (19)
- Hindi (1)
- Bengali (1)
- Gujarati (3)
- Kannada (1)
- Marathi (2)
- Punjabi (1)
- Tamil (1)
- Telugu (3)
Festival
- English (9), Spanish (1), Catalan (1), Czech (4)
nanoTTS
- English (2), German (1), French (1), Italian (1), Spanish (1)
MaryTTS
- English (7), German (3), French (4), Italian (1), Russian (1), Swedish (1), Telugu (1), Turkish (1)
- External server required (Docker image)
- Add --marytts-url command-line argument
Mozilla TTS
- English (1)
- External server required (Docker image, amd64 only)
- Add --mozillatts-url command-line argument

Running

Basic OpenTTS server:

$ docker run -it -p 5500:5500 synesthesiam/opentts

Visit http://localhost:5500

For the HTTP API test page, visit http://localhost:5500/api/

Exclude eSpeak (robotic voices):

$ docker run -it -p 5500:5500 synesthesiam/opentts --no-espeak

Adding MaryTTS and Mozilla TTS

Run using docker compose with MaryTTS and Mozilla TTS:

version: '2'
services:
  opentts:
    image: synesthesiam/opentts
    ports:
      - 5500:5500
    command: --marytts-url http://marytts:59125 --mozillatts-url http://mozillatts:5002
    tty: true
  marytts:
    image: synesthesiam/marytts:5.2
    tty: true
  mozillatts:
    image: synesthesiam/mozilla-tts
    tty: true

Visit http://localhost:5500, choose language en, then pick a voice whose name starts with marytts: or mozillatts:

NOTE: The Mozilla TTS Docker image only runs on amd64 platforms (no Raspberry Pi).

HTTP Endpoints

See swagger.yaml

GET /api/tts
- ?voice - voice in the form tts:voice (e.g., espeak:en)
- ?text - text to speak
- Returns audio/wav bytes
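As a rough illustration of the two query parameters above, the following Python sketch builds a /api/tts URL and downloads the resulting WAV bytes. It assumes a server started with the basic docker run command and reachable at http://localhost:5500. The helper names tts_url and synthesize are made up for this example and are not part of OpenTTS.

```python
import urllib.parse
import urllib.request

# Assumption: the server was started with "-p 5500:5500" on this machine.
BASE_URL = "http://localhost:5500"


def tts_url(text, voice="espeak:en", base_url=BASE_URL):
    """Build a GET /api/tts URL (hypothetical helper, not an OpenTTS API)."""
    # urlencode escapes the ":" in tts:voice ids and the spaces in the text.
    query = urllib.parse.urlencode({"voice": voice, "text": text})
    return f"{base_url}/api/tts?{query}"


def synthesize(text, voice="espeak:en", base_url=BASE_URL):
    """Fetch audio/wav bytes for the text from a running server."""
    with urllib.request.urlopen(tts_url(text, voice, base_url)) as response:
        return response.read()
```

With a server running, something like open("hello.wav", "wb").write(synthesize("Hello world")) should save a playable file. Voices other than espeak:en depend on which TTS systems are enabled.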
GET /api/voices
- Returns JSON object
- Keys are voice ids in the form tts:voice
- Values are objects with:
  - id - voice identifier for TTS system (string)
  - name - friendly name of voice (string)
  - gender - M or F (string)
  - language - 2-character language code (e.g., "en")
  - locale - lower-case locale code (e.g., "en-gb")
  - tts_name - name of text to speech system
- Filter voices using query parameters:
  - ?tts_name - only text to speech system(s)
  - ?language - only language(s)
  - ?locale - only locale(s)
  - ?gender - only gender(s)
GET /api/languages
- Returns JSON list of supported languages
- Filter languages using query parameters:
  - ?tts_name - only text to speech system(s)
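The filter parameters for /api/voices and /api/languages can be tried from Python in a similar way. This is a minimal sketch that assumes a server at http://localhost:5500; api_url and get_json are hypothetical helper names. Only single-valued filters are shown, since the format for passing several values is not described here.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the server was started with "-p 5500:5500" on this machine.
BASE_URL = "http://localhost:5500"


def api_url(endpoint, base_url=BASE_URL, **filters):
    """Build e.g. /api/voices?language=en, skipping filters set to None."""
    active = {name: value for name, value in filters.items() if value is not None}
    url = f"{base_url}/api/{endpoint}"
    query = urllib.parse.urlencode(active)
    return f"{url}?{query}" if query else url


def get_json(endpoint, base_url=BASE_URL, **filters):
    """Fetch and decode the JSON returned by a running server."""
    with urllib.request.urlopen(api_url(endpoint, base_url, **filters)) as response:
        return json.load(response)
```

For example, get_json("voices", language="en", gender="F") should return an object keyed by tts:voice ids, and get_json("languages", tts_name="flite") a list of language codes, provided the server is up.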

Voice Samples

See the samples directory. eSpeak samples are not included because there are so many languages (and they all sound robotic).
