KinyaTTS

A codebase for Kinyarwanda speech synthesis (text-to-speech) based on the MB-iSTFT-VITS2 model in PyTorch. The original codebase and description come from MB-iSTFT-VITS2, which is itself a hybrid of vits2_pytorch and MB-iSTFT-VITS.

An architectural depiction of the model is shown below; its details can be found in the original sources.

[Figure: MB-iSTFT-VITS2 architecture]

Getting started

Inference

  1. Check out the code in the Inference directory and install the monotonic_align and kinyatts modules:
pip install -e  ./Inference/monotonic_align/
pip install -e ./Inference/
  2. Download the pre-trained Kinyarwanda TTS model: TTS_MODEL_ms_ktjw_istft_vits2_base_1M.pt
  3. Go to the kinyatts sub-directory and run an inference server using uwsgi (see the example request after this list):
cd Inference/kinyatts/
nohup sh run.sh &
  4. Alternatively, use the provided Jupyter notebook to synthesize speech: Inference/kinyatts/kinyatts_inference.ipynb
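
Once the server is up, you can send it text to synthesize. The exact HTTP interface started by run.sh is not documented above, so the request below is only a sketch: the port, the /tts route, and the text parameter name are assumptions; adapt them to the routes actually defined in the kinyatts app.

# query the local KinyaTTS inference server (assumed port, route, and parameter)
curl -X POST "http://127.0.0.1:8080/tts" \
     --data-urlencode "text=Muraho, amakuru yawe?" \
     -o output.wav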

Training

Follow the instructions in the Training codebase to install the requirements and train a basic multi-speaker TTS model.
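
As a rough sketch, assuming the Training codebase follows the upstream vits2_pytorch / MB-iSTFT-VITS2 layout (a requirements.txt, JSON configs, and a multi-speaker training script), a run would look roughly like the lines below. The script, config, and experiment names are assumptions, so check the Training directory for the actual entry point and filelist format (multi-speaker VITS-style filelists typically use wav_path|speaker_id|text lines).

cd Training/
pip install -r requirements.txt
# script, config, and experiment names below are placeholders/assumptions
python train_ms.py -c configs/<multispeaker_config>.json -m <experiment_name>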

Credits