- Forward Tacotron trained on a Vietnamese dataset.
- This work is heavily based on ForwardTacotron. For a more detailed description, please see the original authors' README.
- Python >= 3.6
- Install dependencies:
pip install -r requirements.txt
- Run the notebook demo.ipynb, or run the following script:
python demo.py --text "Nhập một đoạn văn bản bất kì" --output model_outputs
Pretrained models can be found in the pretrained directory:
- tacotron_89K.pyt: pretrained model for Tacotron
- forward_300K.pyt: pretrained model for Forward Tacotron
- model_loss0.028723_step860000_weights.pyt: pretrained model for WaveRNN
(1) Preprocess the dataset:
python preprocess.py --path /path/to/dataset
(2) Train Tacotron with:
python train_tacotron.py
(3) Use the trained Tacotron model to create alignment features with:
python train_tacotron.py --force_align
(4) Train ForwardTacotron with:
python train_forward.py
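The four steps above can also be chained from a single Python script. The following is only a minimal sketch: the script names and flags are the ones listed in the steps, while `build_commands`, `run_pipeline`, and the `repo_dir` parameter are hypothetical helpers introduced here for illustration and are not part of this repository.

```python
import subprocess
import sys


def build_commands(dataset_path):
    """Return the four training commands from the steps above, in order."""
    py = sys.executable  # run each script with the current interpreter
    return [
        [py, "preprocess.py", "--path", dataset_path],  # (1) preprocess
        [py, "train_tacotron.py"],                      # (2) train Tacotron
        [py, "train_tacotron.py", "--force_align"],     # (3) alignment features
        [py, "train_forward.py"],                       # (4) train ForwardTacotron
    ]


def run_pipeline(dataset_path, repo_dir="."):
    """Run each step in turn; check=True stops at the first failing step."""
    for cmd in build_commands(dataset_path):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Calling `run_pipeline("/path/to/dataset")` from the repository root would run the steps back to back. Since each training step can take a long time, running them one by one by hand may still be the more practical choice.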
To train the model on your own data, arrange the dataset in the LJSpeech format:
|- dataset_folder/
| |- metadata.csv
| |- wav/
| |- file1.wav
| |- ...
Or refer to the original ForwardTacotron repo for more in-depth details.
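Before preprocessing, it can help to check that the dataset folder matches the layout shown above. The sketch below assumes the usual LJSpeech convention, where each line of metadata.csv is pipe-delimited and its first field is the audio file name without the .wav extension. The function name `find_missing_wavs` is hypothetical and not part of this repository.

```python
import csv
from pathlib import Path


def find_missing_wavs(dataset_dir):
    """Return the IDs in metadata.csv that have no matching wav/<id>.wav file."""
    root = Path(dataset_dir)
    missing = []
    with open(root / "metadata.csv", encoding="utf-8", newline="") as f:
        # Assumption: pipe-delimited rows with the file ID in the first field.
        # Quoting is disabled so quotation marks in transcripts stay plain text.
        for row in csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE):
            if not row:
                continue  # skip blank lines
            file_id = row[0].strip()
            if not (root / "wav" / f"{file_id}.wav").is_file():
                missing.append(file_id)
    return missing
```

An empty list from `find_missing_wavs("/path/to/dataset")` means every metadata entry has a matching audio file under wav/.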