/glados-tts

A GLaDOS TTS, using Forward Tacotron and HiFiGAN. Inference is fast and stable, even on the CPU. A low quality vocoder model is included for mobile use. Rudimentary TTS script included. Works perfectly on Linux, partially on Maybe someone smarter than me can make a GUI.

Primary language: Python. License: MIT.

GLaDOS Text-to-speech (TTS) Voice Generator

Neural network based TTS Engine.

If you just want to play around with the TTS, it works stand-alone:

python3 glados-tts/glados.py

The TTS Engine can also run remotely on a machine more powerful than the Pi, so that TTS is still processed in-house (run this from the glados-tts directory):

python3 engine-remote.py

The default port is 8124. Be sure to update the TTS_ENGINE_API variable in settings.env in your main Glados-voice-assistant directory:

TTS_ENGINE_API			= http://192.168.1.3:8124/synthesize/
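As a rough illustration of how a client might use this setting, here is a minimal Python sketch. It assumes the remote engine accepts the text as a percent-encoded path segment appended to the /synthesize/ URL and replies with raw audio bytes; that request shape is an assumption, not something confirmed here, so check engine-remote.py before relying on it. The helper names (parse_env_line, build_request_url, synthesize) and the output file name are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen


def parse_env_line(line):
    """Split a 'KEY = value' line from settings.env into (key, value)."""
    key, _, value = line.partition("=")
    return key.strip(), value.strip()


def build_request_url(base_url, text):
    """Append the percent-encoded text to the endpoint (assumed request shape)."""
    return base_url.rstrip("/") + "/" + quote(text, safe="")


def synthesize(base_url, text, out_path="output.wav", timeout=30):
    """Fetch audio for `text` and save it.

    Assumes the server answers a GET request with audio bytes.
    """
    with urlopen(build_request_url(base_url, text), timeout=timeout) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path


# Parse the settings.env entry shown above and build an example request URL.
key, api_url = parse_env_line(
    "TTS_ENGINE_API\t\t\t= http://192.168.1.3:8124/synthesize/"
)
example_url = build_request_url(api_url, "Hello, test subject.")
```

Calling synthesize(api_url, "Hello, test subject.") would then attempt the request against the machine running engine-remote.py. The IP address is the example value from settings.env and has to match your own network.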

Training (New Model)

The Tacotron and ForwardTacotron models were trained as multispeaker models on two datasets separated into three speakers: first LJSpeech (13,100 lines), and then a heavily modified version of the Ellen McLain dataset, split into Portal 1 and Portal 2 voices (with punctuation and corrections added manually). Lines from the end of Portal 1, after the cores get knocked off, were counted as Portal 2 lines.
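The speaker split described above can be pictured as a small labeling rule. The following Python sketch is purely illustrative: the speaker IDs, the source tags, and the after_core_removal flag are hypothetical and do not reflect the actual training pipeline or dataset metadata.

```python
# Hypothetical speaker IDs for the three speakers described above.
SPEAKERS = {"ljspeech": 0, "portal1": 1, "portal2": 2}


def assign_speaker(source, after_core_removal=False):
    """Return a speaker ID for a voice line.

    `source` is a hypothetical tag: "ljspeech", "portal1", or "portal2".
    `after_core_removal` marks Portal 1 lines spoken after the cores are
    knocked off, which are counted as Portal 2 lines.
    """
    if source == "ljspeech":
        return SPEAKERS["ljspeech"]
    if source == "portal2" or (source == "portal1" and after_core_removal):
        return SPEAKERS["portal2"]
    if source == "portal1":
        return SPEAKERS["portal1"]
    raise ValueError(f"unknown source: {source!r}")
```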

Training (Old Model)

The initial, regular Tacotron model was trained first on LJSpeech, and then on a heavily modified version of the Ellen McLain dataset (all non-Portal 2 voice lines removed, punctuation added).

  • The Forward Tacotron model was only trained on about 600 voice lines.
  • The HiFiGAN model was generated through transfer learning from the sample.
  • All models have been optimized and quantized.

Installation Instructions

If you want to install the TTS Engine on your machine, please follow the steps below.

  1. Download the model files from Google Drive and unzip them into the repo folder.
  2. Install the required Python packages, e.g., by running pip install -r requirements.txt.