A text-to-speech project that clones a custom voice in an end-to-end manner.
We use Tacotron 2, WaveGlow, and speech embeddings (WIP) to achieve this.
Simply run /usr/bin/bash setup.sh
to create the conda environment, install the dependencies, and activate the environment.
To start, ensure you have the following:
- Anaconda or Miniconda installed
- A Linux distro (tested on Ubuntu 20.04) with an NVIDIA GPU
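
The prerequisites above can be checked before running setup.sh. The following is a minimal sketch and not part of this repository: the function name `check_prerequisites` is hypothetical, and it assumes that finding `conda` and `nvidia-smi` on the PATH is a reasonable stand-in for "conda is installed" and "an NVIDIA GPU driver is present". It does not verify the GPU itself or the Ubuntu version.

```python
import platform
import shutil


def check_prerequisites(which=shutil.which, system=platform.system):
    """Return a list of problems found; an empty list means every check passed.

    `which` and `system` are injectable so the checks can be exercised
    without depending on the machine the code runs on.
    """
    problems = []
    # The README only claims Linux support (tested on Ubuntu 20.04).
    if system() != "Linux":
        problems.append("not running on Linux")
    # Assumption: a working Anaconda/Miniconda install exposes `conda` on PATH.
    if which("conda") is None:
        problems.append("conda not found on PATH (install Anaconda or Miniconda)")
    # Assumption: the NVIDIA driver ships `nvidia-smi`; its absence suggests
    # the driver (and therefore GPU support) is missing.
    if which("nvidia-smi") is None:
        problems.append("nvidia-smi not found on PATH (NVIDIA driver may be missing)")
    return problems


if __name__ == "__main__":
    found = check_prerequisites()
    if found:
        for problem in found:
            print("Missing prerequisite:", problem)
    else:
        print("All prerequisite checks passed.")
```

Running the script directly prints either the list of missing prerequisites or a confirmation line. A passing result only means the tools were found on the PATH, so it is a quick sanity check, not a guarantee that setup.sh will succeed.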