Kyubyong/dc_tts

Usage guide / tutorial?

JoranDox opened this issue · 2 comments

Hi,

I've had okay results with Keithito's Tacotron implementation, but I wanted to try this one too (and your Tacotron 2 implementation as well, though I suppose the question applies equally to both).

Could you give a short guide on how to run this model from scratch? My own dataset is modeled after Keithito's LJSpeech dataset, so anything that works with that should work on my data.

On top of that, could you share your weights / checkpoints? I've noticed that, even though my dataset is in Dutch, training worked after only a few hundred iterations when started from Keithito's English Tacotron weights. I believe it was mainly the alignment that was hard and slow to train from scratch.

Thanks!

I'll upload the pretrained model once training is finished. To get the code up and running, just follow the instructions in README.md. If it doesn't work, please let me know.

Ah, I see the issue now: the LJSpeech dataset used to have a file called metadata.csv, which has since been renamed to transcripts.csv.

I thought you had some extra preprocessing step transforming one into the other, but all it takes is a name change :D
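
For anyone hitting the same thing, here is a minimal sketch of that name change in Python. The helper `ensure_transcript_file` is hypothetical (it is not part of this repo), and the two filenames are simply the ones mentioned above, so check which name the data loader actually expects before relying on it. It copies rather than renames, so the original file stays in place for other tools.

```python
import shutil
from pathlib import Path


def ensure_transcript_file(data_dir, old_name="metadata.csv", new_name="transcripts.csv"):
    """Make sure data_dir contains new_name, copying it from old_name if needed.

    Hypothetical helper: both filenames are assumptions taken from this thread.
    Returns the path to the file named new_name.
    """
    data_dir = Path(data_dir)
    src = data_dir / old_name
    dst = data_dir / new_name

    # Nothing to do if the expected file is already there.
    if dst.exists():
        return dst

    if not src.exists():
        raise FileNotFoundError(f"Neither {dst} nor {src} exists")

    # Copy instead of rename so the original file is left untouched.
    shutil.copyfile(src, dst)
    return dst
```

Calling it as `ensure_transcript_file("path/to/your/dataset")` (placeholder path) leaves both files in the dataset folder; the contents are identical, only the name differs.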