Tacotron over MXNet

A tech demo of MXNet capabilities, built around an implementation of Tacotron. This is a work in progress.

This project was built over 8 weeks, from October to December 2017, at the PiCampus AI School in Rome.

List of functionalities and TODOs

  • Multithreaded data iterator
  • DSP tools
  • CBHG module for spectrograms
  • Basic seq2seq example for string reversal. It will be used as the Tacotron backbone
  • Encoder with CBHG
  • Attention model
  • Custom decoder for processing r * mel_bands spectrogram frames at each time step during cell unrolling
  • Switch to MXNet 1.0
  • Switch to Gluon
  • Clean up and organize code for better understanding
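To illustrate the custom decoder item above: with a reduction factor r, each decoder time step handles r consecutive spectrogram frames at once, so a sequence of T frames with mel_bands values each becomes roughly T / r rows of r * mel_bands values. The snippet below is a minimal, dependency-free sketch of that grouping only, not this project's decoder code. The function name `group_frames` and the choice to zero-pad the last chunk are assumptions made for illustration.

```python
def group_frames(frames, r):
    """Group consecutive frames into chunks of r and flatten each chunk.

    frames: list of T frames, each a list of mel_bands floats.
    Returns ceil(T / r) rows, each holding r * mel_bands values.
    Assumption: the last chunk is zero-padded when T is not a multiple of r.
    """
    if r < 1:
        raise ValueError("r must be a positive integer")
    if not frames:
        return []

    mel_bands = len(frames[0])
    # Number of all-zero frames needed to reach a multiple of r.
    padding = (-len(frames)) % r
    padded = frames + [[0.0] * mel_bands for _ in range(padding)]

    grouped = []
    for start in range(0, len(padded), r):
        row = []
        for frame in padded[start:start + r]:
            row.extend(frame)  # concatenate the r frames of this step
        grouped.append(row)
    return grouped


# Toy input: 5 frames with 2 mel bands each, grouped with r = 2.
toy_frames = [[float(t), float(t) + 0.5] for t in range(5)]
toy_grouped = group_frames(toy_frames, r=2)
```

Here `toy_grouped` has 3 rows of 4 values each, and the final row carries one real frame followed by one padding frame.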

Getting Started

  • install MXNet: pip install -r requirements.txt
  • run: python tacotron.py

With the default settings, a simple dataset is used for training. Prediction samples are generated at the end of the training phase.

If you want to train on a larger dataset, Kyubyong has cut and formatted this English Bible. You can find his dataset here and the CSV text here.

Prerequisites

This project has been developed on

  • MXNet 0.12
  • librosa
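Before running the project, it can help to confirm that both prerequisites are importable in the active environment. The following is a small sketch using only the Python standard library. The helper name `missing_modules` is hypothetical, and it checks only that the modules can be found, not which version is installed.

```python
from importlib.util import find_spec

# Top-level module names taken from the prerequisites list above.
REQUIRED = ("mxnet", "librosa")


def missing_modules(names=REQUIRED):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

An empty list means both modules were located. It does not guarantee that the installed MXNet matches version 0.12.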

Authors

This project was developed by Alberto Massidda and Stefano Artuso during Pi School's AI programme in Fall 2017.


Acknowledgments