MaryTTS 5.2 with HSMM Voices

A multi-platform Docker image containing the MaryTTS 5.2 text-to-speech server and a collection of hidden semi-Markov model (HSMM) voices for various languages.

The image also includes the txt2wav utility for command-line text-to-speech.

Supported Platforms:

amd64 - laptops, desktops, servers
arm/v7 - Raspberry Pi 2/3
arm64 - Raspberry Pi 3+/4

Running

To run a MaryTTS server:

$ docker run -it -p 59125:59125 synesthesiam/marytts:5.2

You should now be able to access the server at http://localhost:59125
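
Once the server is up, speech can also be requested over HTTP. The sketch below assumes the stock MaryTTS HTTP interface (the /process endpoint with the INPUT_TEXT, INPUT_TYPE, OUTPUT_TYPE, AUDIO, LOCALE, and VOICE parameters). The helper names marytts_params and marytts_say, the MARYTTS_URL variable, and the en_US default locale are illustrative and not part of this image.

```shell
#!/usr/bin/env bash
# Sketch: request a WAV file from a running MaryTTS server with curl.
# Assumes the stock MaryTTS HTTP interface (/process); helper names are hypothetical.

MARYTTS_URL="${MARYTTS_URL:-http://localhost:59125}"

# Print the fixed query parameters for a given voice and (optional) locale.
marytts_params() {
  local voice="$1" locale="${2:-en_US}"
  printf 'INPUT_TYPE=TEXT&OUTPUT_TYPE=AUDIO&AUDIO=WAVE_FILE&LOCALE=%s&VOICE=%s' \
    "$locale" "$voice"
}

# Synthesize TEXT with VOICE and write the audio to OUTFILE.
# --get sends all data as a query string; --data-urlencode escapes the text.
marytts_say() {
  local voice="$1" outfile="$2" text="$3"
  curl --silent --fail --get "$MARYTTS_URL/process" \
    --data-urlencode "INPUT_TEXT=$text" \
    --data "$(marytts_params "$voice")" \
    --output "$outfile"
}
```

With the server running, something like `marytts_say cmu-slt-hsmm hello.wav 'Welcome to the world of speech synthesis!'` should write the audio to hello.wav, provided the voice is loaded and the locale matches it.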

Beware that the server may consume a lot of RAM on a Raspberry Pi!

Restricting Voices

You can control which voices are loaded with -v or --voice arguments:

$ docker run -it -p 59125:59125 synesthesiam/marytts:5.2 --voice cmu-slt-hsmm --voice cmu-rms-hsmm

This loads only the JARs needed for the specified voices, which may help conserve RAM on a Raspberry Pi.

A list of voices can be obtained with:

$ docker run -it synesthesiam/marytts:5.2 --voices

Command-Line Utility

The txt2wav utility is included in the Docker image. A bash wrapper script allows you to list voices and includes only the necessary JARs to reduce start-up time.

Copy the included txt2wav script to a directory in your $PATH and mark it executable. The script runs the Docker image as the current user, maps your $HOME directory into the container, and sets the working directory to $PWD.
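
The copy-and-chmod step can be sketched as a small helper. The function names install_script and on_path and the default destination ~/.local/bin are assumptions made for illustration; any directory already on your $PATH works equally well.

```shell
#!/usr/bin/env bash
# Sketch: copy a wrapper script into a directory on $PATH and mark it executable.
# The default destination ~/.local/bin is an assumption, not a requirement.

install_script() {
  local src="$1" dest_dir="${2:-$HOME/.local/bin}"
  [ -f "$src" ] || { echo "not found: $src" >&2; return 1; }
  mkdir -p "$dest_dir" || return 1
  # install copies the file and sets its mode in one step.
  install -m 755 "$src" "$dest_dir/$(basename "$src")"
}

# Succeed if the given directory appears as an entry in $PATH.
on_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `install_script ./txt2wav` followed by `on_path "$HOME/.local/bin" || echo 'add ~/.local/bin to PATH'` would install the wrapper and warn if the destination is not searched by the shell.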

List available voices:

$ txt2wav

voice	language
...
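
If the listing is long, it can be filtered by language. This is a minimal sketch that assumes the output is tab-separated with a single header row, as shown above, and that the language column starts with a language code; the function name voices_for_language is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the names of voices whose language column starts with a prefix.
# Assumes tab-separated input with one header row (voice<TAB>language).

voices_for_language() {
  local lang="$1"
  # NR > 1 skips the header; index(...) == 1 is a prefix match on column 2.
  awk -F '\t' -v lang="$lang" 'NR > 1 && index($2, lang) == 1 { print $1 }'
}
```

It reads the listing from standard input, so it would be used as `txt2wav | voices_for_language en`.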

Generate WAV file:

$ txt2wav --voice cmu-slt-hsmm -o /path/to/tts.wav 'Welcome to the world of speech synthesis!'

Play directly:

$ txt2wav 'Welcome to the world of speech synthesis!' | aplay

Online Mode

You can run txt2wav in an "online" mode, in which it continuously reads sentences from standard input, overwrites the output WAV file for each one, and echoes the sentence back on standard output.

$ txt2wav --online -o /path/to/tts.wav

Reading sentences from stdin
...

Typing a sentence and pressing <ENTER> overwrites /path/to/tts.wav and prints the sentence back on standard output. To end the session, press CTRL+D.

Voices