Vietnamese Text to Speech

ForwardTacotron trained on a Vietnamese dataset.
This work is heavily based on ForwardTacotron. For a more detailed description, please see the original authors' README.

Quick start:

pip install -r requirements.txt

python demo.py --text "Nhập một đoạn văn bản bất kì" --output model_outputs

Pretrained models can be found in the pretrained directory. To train your own model:

(1) Preprocess the dataset:

python preprocess.py --path /path/to/dataset

(2) Train Tacotron with:

python train_tacotron.py

(3) Use the trained Tacotron model to create alignment features with:

python train_tacotron.py --force_align

(4) Train ForwardTacotron with:

python train_forward.py
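
The four steps above can also be chained from a single script. The following is a minimal sketch, assuming it is run from the repository root and that each script accepts exactly the arguments shown above; the helper names build_training_commands and run_pipeline are hypothetical and not part of this repo.

```python
import subprocess
import sys


def build_training_commands(dataset_path):
    """Return the four training commands from this README, in order."""
    py = sys.executable  # reuse the interpreter running this script
    return [
        [py, "preprocess.py", "--path", dataset_path],   # (1) preprocess
        [py, "train_tacotron.py"],                       # (2) train Tacotron
        [py, "train_tacotron.py", "--force_align"],      # (3) alignment features
        [py, "train_forward.py"],                        # (4) train ForwardTacotron
    ]


def run_pipeline(dataset_path):
    """Run each step in sequence, stopping at the first failure."""
    for cmd in build_training_commands(dataset_path):
        print("Running:", " ".join(cmd))
        # check=True raises CalledProcessError on a non-zero exit code,
        # so a later step never starts after an earlier one fails.
        subprocess.run(cmd, check=True)
```

Calling run_pipeline("/path/to/dataset") would then run all four steps in order. Since training can take a long time, running the steps by hand may still be preferable when you want to inspect intermediate results.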

To train the model on your own data, arrange the dataset in the LJSpeech format:

|- dataset_folder/
|   |- metadata.csv
|   |- wav/
|       |- file1.wav
|       |- ...

Alternatively, refer to the original ForwardTacotron repo for more in-depth details.
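
Before preprocessing, it can help to check that the dataset folder matches the layout above. The following is a rough sketch, assuming metadata.csv is pipe-delimited with the file id (without the .wav extension) in the first field, as in the original LJSpeech dataset; the function name find_missing_wavs is hypothetical and not part of this repo.

```python
import csv
from pathlib import Path


def find_missing_wavs(dataset_folder):
    """Return ids listed in metadata.csv that have no matching wav/<id>.wav."""
    root = Path(dataset_folder)
    missing = []
    with open(root / "metadata.csv", encoding="utf-8", newline="") as f:
        # Assumption: LJSpeech-style rows, e.g. "file1|transcript text".
        # QUOTE_NONE keeps quotation marks in the transcript as plain text.
        reader = csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE)
        for row in reader:
            if not row or not row[0].strip():
                continue  # skip blank lines
            file_id = row[0].strip()
            if not (root / "wav" / f"{file_id}.wav").is_file():
                missing.append(file_id)
    return missing
```

An empty list from find_missing_wavs("dataset_folder") means every metadata entry has an audio file. The check does not validate the audio content or the transcripts themselves.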