Tacotron - PyTorch implementation

Prerequisite

LJ Speech 1.1, a single-speaker (female) speech dataset.
For preprocessing the speech signal, I follow Kyubyong's DCTTS repo (TensorFlow). It worked well.

Download the dataset above and set its path in config.py. Then run the command below. The first argument controls signal preprocessing and the second controls metadata generation (train/test split).
```
python prepro.py 1 1
```
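As a rough illustration of the metadata step, here is a minimal sketch of a train/test split over LJ Speech's pipe-separated `metadata.csv` rows (file id, raw transcript, normalized transcript). It is not taken from prepro.py: the function names `split_metadata` and `parse_row`, the 10% test fraction, and the fixed seed are all assumptions made for the example.

```python
import random


def split_metadata(lines, test_fraction=0.1, seed=0):
    """Shuffle non-empty metadata rows and return (train_rows, test_rows).

    Hypothetical helper: the fraction and seed are illustrative defaults,
    not values used by this repo.
    """
    rows = [ln.strip() for ln in lines if ln.strip()]
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction)) if rows else 0
    return rows[n_test:], rows[:n_test]


def parse_row(row):
    """Return (file_id, text) from one pipe-separated metadata row.

    The last field is used as the text, which is the normalized
    transcript in LJ Speech's three-field layout.
    """
    parts = row.split("|")
    return parts[0], parts[-1]
```

Typical use would be to read `metadata.csv` with `open(path, encoding="utf-8")`, pass the lines to `split_metadata`, and write the two lists out as separate files.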
The model needs to train for 100k+ steps (10+ hours).
```
python train.py
```
After training, you can synthesize some speech from text.
```
python synthesize.py
```

In speech synthesis, the attention module is important. If training goes well, the attention becomes monotonic, as in the following figures.
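Besides inspecting plots, monotonicity can be checked numerically. The sketch below is one possible heuristic, not part of this repo: it assumes the alignment is available as a nested list `alignment[t][n]` (decoder step `t`, encoder position `n`) and measures how often the attention peak stays in place or moves forward between consecutive decoder steps. The name `monotonic_fraction` is hypothetical.

```python
def monotonic_fraction(alignment):
    """Fraction of consecutive decoder steps whose attention peak does not move backward.

    alignment[t][n] is the attention weight that decoder step t puts on
    encoder position n. A value near 1.0 suggests a monotonic alignment.
    """
    # Index of the largest weight in each decoder step's attention row.
    peaks = [max(range(len(row)), key=row.__getitem__) for row in alignment]
    if len(peaks) < 2:
        return 1.0
    forward = sum(1 for prev, cur in zip(peaks, peaks[1:]) if cur >= prev)
    return forward / (len(peaks) - 1)


# A perfectly diagonal alignment versus one whose peak jumps backward.
diagonal = [[1.0 if n == t else 0.0 for n in range(4)] for t in range(4)]
scrambled = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(monotonic_fraction(diagonal))   # 1.0
print(monotonic_fraction(scrambled))  # 0.5
```

A score well below 1.0 late in training would be a hint that attention has not converged, though the threshold to use is a judgment call.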