KinyaTTS

A codebase for Kinyarwanda speech synthesis (text-to-speech) based on the MB-iSTFT-VITS2 model in PyTorch. The original codebase and description come from MB-iSTFT-VITS2, which is itself a hybrid of vits2_pytorch and MB-iSTFT-VITS.

An architectural depiction of the model is presented below; its details can be found in the original sources.

Getting started

Inference

Check out the code in the Inference directory and install the monotonic_align and kinyatts modules:

pip install -e ./Inference/monotonic_align/
pip install -e ./Inference/

Download the pre-trained Kinyarwanda TTS model TTS_MODEL_ms_ktjw_istft_vits2_base_1M.pt
Go to the kinyatts sub-directory and start the inference server, which runs under uwsgi:

cd Inference/kinyatts/
nohup sh run.sh &

Alternatively, use the provided Jupyter notebook to synthesize speech: Inference/kinyatts/kinyatts_inference.ipynb
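
If the server or the notebook fails at import time, it can help to confirm that the two editable installs above actually succeeded. The following is a minimal sketch using only the Python standard library. It assumes the import names match the directory names (`monotonic_align` and `kinyatts`), which is not confirmed here; `missing_modules` is a hypothetical helper, not part of this codebase.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names in `names` that are not importable."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module cannot be located
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumption: the import names match the package directory names
    absent = missing_modules(["monotonic_align", "kinyatts"])
    if absent:
        print("Not installed:", ", ".join(absent))
    else:
        print("All inference modules are importable.")
```

If a name is reported as not installed, re-running the corresponding `pip install -e` command from the repository root is the first thing to try.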

Training

Follow the instructions in the Training codebase to install the requirements and train a basic multi-speaker TTS model.

Credits

FENRlR/MB-iSTFT-VITS2
jaywalnut310/vits
p0p4k/vits2_pytorch
MasayaKawamura/MB-iSTFT-VITS
ORI-Muchim/PolyLangVITS
tonnetonne814/MB-iSTFT-VITS-44100-Ja
misakiudon/MB-iSTFT-VITS-multilingual
