MaryTTS 5.2 text-to-speech server and a collection of hidden semi-Markov model (HSMM) voices for various languages in a multi-platform Docker image.
Also includes the txt2wav utility for command-line text-to-speech.
Supported Platforms:
amd64
- laptops, desktops, servers
arm/v7
- Raspberry Pi 2/3
arm64
- Raspberry Pi 3+/4
To run a MaryTTS server:
$ docker run -it -p 59125:59125 synesthesiam/marytts:5.2
You should now be able to access the server at http://localhost:59125
Beware that this may consume a lot of RAM on a Raspberry Pi!
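Once the server is running, you can request audio over HTTP. The sketch below is a minimal example, not part of this image: the `mary_say` function name, the `MARY_URL` variable, and the `en_US` locale are assumptions made for illustration, and the `/process` endpoint with its `INPUT_TEXT`, `INPUT_TYPE`, `OUTPUT_TYPE`, `AUDIO`, `LOCALE`, and `VOICE` parameters comes from the MaryTTS 5.x HTTP interface rather than from this README, so check it against your server before relying on it.

```shell
# Assumed default: the port published by the "docker run" command above.
MARY_URL="${MARY_URL:-http://localhost:59125}"

# mary_say TEXT OUTFILE [VOICE]
# Ask the MaryTTS server to synthesize TEXT and write a WAV file to OUTFILE.
# VOICE defaults to cmu-slt-hsmm; the locale is assumed to be en_US.
mary_say() {
    curl --silent --show-error --fail --get \
        --data-urlencode "INPUT_TEXT=$1" \
        --data "INPUT_TYPE=TEXT" \
        --data "OUTPUT_TYPE=AUDIO" \
        --data "AUDIO=WAVE_FILE" \
        --data "LOCALE=en_US" \
        --data "VOICE=${3:-cmu-slt-hsmm}" \
        --output "$2" \
        "$MARY_URL/process"
}

# Example (requires a running server):
#   mary_say 'Welcome to the world of speech synthesis!' tts.wav
```

`--get` makes curl append the `--data` fields to the URL as a query string, and `--data-urlencode` takes care of escaping spaces and punctuation in the text.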
You can control which voices are loaded with -v or --voice arguments:
$ docker run -it -p 59125:59125 synesthesiam/marytts:5.2 --voice cmu-slt-hsmm --voice cmu-rms-hsmm
This loads only the JARs needed for the specified voices, which may help conserve RAM on a Raspberry Pi.
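If you keep your preferred voices in a list, a small helper can expand them into repeated --voice arguments. This is only a convenience sketch; `voice_args` is a hypothetical name and not something shipped with the image.

```shell
# voice_args VOICE...
# Print "--voice NAME " once per argument, ready to append to "docker run".
voice_args() {
    for voice in "$@"; do
        printf '%s %s ' --voice "$voice"
    done
}

# Example (left unquoted on purpose so the shell splits it into words):
#   docker run -it -p 59125:59125 synesthesiam/marytts:5.2 $(voice_args cmu-slt-hsmm cmu-rms-hsmm)
```

The unquoted command substitution relies on voice names containing no spaces, which holds for the names shown above.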
A list of voices can be obtained with:
$ docker run -it synesthesiam/marytts:5.2 --voices
The txt2wav utility is included in the Docker image. A bash wrapper script allows you to list voices and includes only the necessary JARs to reduce start-up time.
Copy the included txt2wav script to somewhere in your $PATH and mark it executable.
This script runs the Docker image as the current user, maps your $HOME directory, and sets the working directory to $PWD.
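To make that behaviour concrete, here is a rough sketch of the standard `docker run` flags that achieve the same three effects. It is an assumption about how such a wrapper could be written, not a copy of the included script; `user_mount_flags` is a hypothetical name, and the real script may use different options or an entrypoint not shown here.

```shell
# user_mount_flags
# Print, one per line, docker run flags that:
#   - run the container as the invoking user and group,
#   - mount $HOME at the same path inside the container,
#   - start in the current working directory.
user_mount_flags() {
    printf '%s\n' \
        --user "$(id -u):$(id -g)" \
        --volume "$HOME:$HOME" \
        --workdir "$PWD"
}
```

Running as the current user keeps generated WAV files owned by you rather than root, and mounting $HOME at the same path means an output path under your home directory resolves to the same file inside and outside the container.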
List available voices:
$ txt2wav
voice language
...
Generate WAV file:
$ txt2wav --voice cmu-slt-hsmm -o /path/to/tts.wav 'Welcome to the world of speech synthesis!'
Play directly:
$ txt2wav 'Welcome to the world of speech synthesis!' | aplay
You can run txt2wav in an "online" mode, where it continuously reads sentences from standard input, overwrites the output WAV file for each one, and echoes the sentence back on standard output:
$ txt2wav --online -o /path/to/tts.wav
Reading sentences from stdin
...
Typing a sentence and pressing <ENTER> will overwrite /path/to/tts.wav and print the sentence back on standard output. To end the session, press CTRL+D.