Audio samples can be found here: online demo
All synthesized stimuli can be accessed here.
Training data can be found here.
You can run the TTS models (Tacotron2 and WaveGlow) directly on Google Colab using its GPU runtime.
torch == 1.1.0
- Download the pre-trained Mandarin models from this folder.
- Download the pre-trained Chinese BERT (BERT-wwm-ext, Chinese).
- Run `inference_bert.ipynb` (a rough sketch of the BERT feature-extraction step follows this list).
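The notebook conditions synthesis on character-level BERT features. The snippet below is a minimal sketch of that extraction step, assuming the Hugging Face `transformers` library, a BERT-wwm-ext checkpoint already in Hugging Face format, and placeholder folder/text values; the actual notebook may load and consume the embeddings differently.

```python
# Sketch only: character-level BERT features for a Mandarin sentence.
# Assumes a recent `transformers` version; paths and text are placeholders.
import torch
from transformers import BertModel, BertTokenizer

bert_folder = "path_to_bert_folder"  # folder with the BERT-wwm-ext checkpoint
tokenizer = BertTokenizer.from_pretrained(bert_folder)
bert = BertModel.from_pretrained(bert_folder)
bert.eval()

text = "妈妈骑马"  # placeholder input; Chinese BERT tokenizes per character
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = bert(**inputs)

# One 768-dimensional vector per token: shape (1, seq_len, 768) for a base model.
char_embeddings = outputs.last_hidden_state
print(char_embeddings.shape)
```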
Alternatively, run synthesis from the command line:
python synthesize.py --text ./stimuli/tone3_stimuli --use_bert --bert_folder path_to_bert_folder \
    --tacotron_path path_to_pre-trained_tacotron2 --waveglow_path path_to_pre-trained_waveglow \
    --out_dir path_output_dir
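For reference, the sketch below approximates what a synthesize.py-style script does with the two checkpoints: load Tacotron 2 and WaveGlow, convert the input text to a symbol sequence, predict a mel spectrogram, and vocode it to a waveform. The module imports (`hparams`, `model`, `text`) and the cleaner name follow NVIDIA's reference code, the checkpoint paths and input text are placeholders, and the BERT conditioning enabled by `--use_bert` is omitted, so the actual script in this repository may differ.

```python
import numpy as np
import torch
from scipy.io.wavfile import write

# Modules from NVIDIA's Tacotron2 reference implementation
# (assumption: this repo keeps the same layout).
from hparams import create_hparams
from model import Tacotron2
from text import text_to_sequence

hparams = create_hparams()

# Load the pre-trained Tacotron 2 acoustic model (placeholder path).
tacotron = Tacotron2(hparams).cuda().eval()
tacotron.load_state_dict(
    torch.load("path_to_pre-trained_tacotron2")["state_dict"])

# Load the pre-trained WaveGlow vocoder (placeholder path).
waveglow = torch.load("path_to_pre-trained_waveglow")["model"].cuda().eval()

# Placeholder pinyin-style input; the Mandarin front end in this repo
# (and the BERT conditioning) may expect a different representation.
sequence = np.array(
    text_to_sequence("ma1 ma5 qi2 ma3", ["basic_cleaners"]))[None, :]
sequence = torch.from_numpy(sequence).cuda().long()

with torch.no_grad():
    # Tacotron 2 inference returns (mel, mel_postnet, gate, alignments).
    _, mel_postnet, _, _ = tacotron.inference(sequence)
    audio = waveglow.infer(mel_postnet, sigma=0.666)

write("output.wav", hparams.sampling_rate,
      audio[0].cpu().numpy().astype(np.float32))
```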
Note: the current implementation is based on NVIDIA's public implementations of Tacotron 2 and WaveGlow.
This project has benefited immensely from the following works:
- Pre-Trained Chinese BERT with Whole Word Masking
- Tacotron 2 - PyTorch implementation with faster-than-realtime inference
- WaveGlow: a Flow-based Generative Network for Speech Synthesis
- A Demo of MTTS Mandarin/Chinese Text to Speech FrontEnd
- Open-source Mandarin speech synthesis data
- 只用同一声调的字可以造出哪些句子? (What sentences can be made using only characters of the same tone?)