Wavebender GAN:

Deep architecture for high-quality and controllable speech synthesis through interpretable features and exchangeable neural synthesizers

Gustavo Teodoro Döhler Beck, Ulme Wennberg, Zofia Malisz, Gustav Eje Henter

This is the official code repository for the paper Wavebender GAN: An architecture for phonetically meaningful speech manipulation.

For audio examples, visit our demo page.

Data

All 13,100 audio samples from the LJ Speech dataset should be stored in data/wavs/. Split them into training and test sets, and store the results in wavebender_features_data/train/ and wavebender_features_data/test/. Each of these folders contains .txt files listing the audio files that belong to that split.
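The split itself is not prescribed here, so the following is only a minimal sketch of one way to produce the file lists. The function name `split_wavs`, the 10% test fraction, the fixed seed, and the list file names `train.txt` and `test.txt` are assumptions for illustration, not the names the training scripts necessarily expect.

```python
import random
from pathlib import Path


def split_wavs(wav_dir, out_dir, test_fraction=0.1, seed=0):
    """Split the .wav files in wav_dir into train and test file lists.

    Writes <out_dir>/train/train.txt and <out_dir>/test/test.txt
    (hypothetical file names), with one audio path per line.
    """
    wavs = sorted(str(p) for p in Path(wav_dir).glob("*.wav"))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(wavs)
    n_test = int(len(wavs) * test_fraction)
    splits = {"test": wavs[:n_test], "train": wavs[n_test:]}
    for name, files in splits.items():
        split_dir = Path(out_dir) / name
        split_dir.mkdir(parents=True, exist_ok=True)
        (split_dir / f"{name}.txt").write_text("\n".join(files) + "\n")
    return splits


# Example call from the repository root:
# split_wavs("data/wavs", "wavebender_features_data")
```

Adjust the test fraction and file names to whatever the training scripts in this repository read.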

Tacotron 2

Before you start training, download Tacotron 2 and save the checkpoint file waveglow_256channels_universal_v5.pt in the main folder. A copy of the same file must also be in the tacotron2 folder.

Training

Wavebender Net and Wavebender GAN are trained separately: run train_wavebender_net.py to train the former and train_wavebender_gan.py to train the latter. Make sure the data is already in the format described above before running either script.
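A quick pre-flight check can catch a missing folder or checkpoint before a training run starts. The sketch below is not part of the repository; the helper name `missing_paths` is hypothetical, and the path list simply restates the locations mentioned in this README, relative to the repository root.

```python
from pathlib import Path

# Locations named in this README, relative to the repository root.
REQUIRED_PATHS = [
    "data/wavs",
    "wavebender_features_data/train",
    "wavebender_features_data/test",
    "waveglow_256channels_universal_v5.pt",
    "tacotron2/waveglow_256channels_universal_v5.pt",
]


def missing_paths(root=".", required=REQUIRED_PATHS):
    """Return the required paths that do not exist under root."""
    root = Path(root)
    return [p for p in required if not (root / p).exists()]


# Example call from the repository root:
# print(missing_paths())  # an empty list means everything is in place
```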
