An implementation of the paper "Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention". The original code was taken from here.
- NumPy >= 1.11.1
- TensorFlow >= 1.3 (note that the API of tf.contrib.layers.layer_norm has changed since 1.3)
- librosa
- tqdm
- matplotlib
- scipy
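
The version constraints above can be checked before training. The following is a minimal sketch, not part of the original repository: the helper names (`version_tuple`, `meets_minimum`, `check_requirements`) are hypothetical, and it assumes Python 3.8+ (for `importlib.metadata`) and that each package is installed under the distribution name shown in the list. A TensorFlow build installed under a different distribution name would be reported as missing.

```python
import re
from importlib import metadata

# Minimum versions copied from the requirements list above.
# Packages listed without a minimum version map to None.
REQUIREMENTS = {
    "numpy": "1.11.1",
    "tensorflow": "1.3",
    "librosa": None,
    "tqdm": None,
    "matplotlib": None,
    "scipy": None,
}


def version_tuple(version):
    """Turn '1.11.1' into (1, 11, 1), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum` (None means any version)."""
    if minimum is None:
        return True
    have, need = version_tuple(installed), version_tuple(minimum)
    # Pad with zeros so that '1.3' and '1.3.0' compare as equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_requirements(requirements=REQUIREMENTS):
    """Map each package name to 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

Versions are compared as integer tuples rather than strings, so that 1.9.3 is correctly treated as older than 1.11.1. The check only enforces the stated minimums; it does not detect the tf.contrib.layers.layer_norm API change mentioned above.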
See this Colab notebook to train the model on your own dataset.