minwookchang/pytorch-WaveGAN

A pytorch implementation of WaveGAN


pytorch-WaveGAN (Not Yet Completed)

A PyTorch implementation of WaveGAN (Donahue et al., 2018).

WaveGAN is the first approach to synthesizing raw audio with a GAN.

Features

The overall architecture is based on DCGAN
2D Conv(5,5) -> 1D Conv(1,25)
The original DCGAN output size is 4096 (64x64). Add one layer to enlarge the output to 16384 samples.
16384 samples is slightly more than 1 second (1.024 s) of raw audio at 16 kHz
Convert audio data from 16-bit integers to 32-bit floating point
Train a post-processing filter to avoid checkerboard artifacts
Phase shuffle keeps the Discriminator from learning to reject samples based on checkerboard artifacts
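
The output-size arithmetic and the phase shuffle operation listed above can be sketched in plain Python. This is a minimal illustration, not this repository's code. The helper names (`conv_transpose1d_length`, `generator_output_length`, `phase_shuffle`, `random_phase_shuffle`) are hypothetical. The starting length of 16, stride of 4, padding of 11, output padding of 1, and maximum shift of 2 are assumptions chosen so that each layer multiplies the length by 4. The phase shuffle here works on a plain list rather than a tensor.

```python
import random


def conv_transpose1d_length(length, kernel_size=25, stride=4,
                            padding=11, output_padding=1):
    """Output length of a 1D transposed convolution (dilation 1).

    The default padding values are assumptions that make the layer
    upsample by exactly a factor of `stride`.
    """
    return ((length - 1) * stride - 2 * padding
            + (kernel_size - 1) + output_padding + 1)


def generator_output_length(num_layers, start_length=16):
    """Length after stacking `num_layers` upsampling layers."""
    length = start_length
    for _ in range(num_layers):
        length = conv_transpose1d_length(length)
    return length


def phase_shuffle(signal, shift):
    """Shift `signal` by `shift` samples (right if positive, left if
    negative), filling the gap by reflection. Requires |shift| < len(signal).
    """
    n = len(signal)
    shifted = []
    for i in range(n):
        j = i - shift
        if j < 0:
            j = -j                 # reflect about the first sample
        elif j >= n:
            j = 2 * (n - 1) - j    # reflect about the last sample
        shifted.append(signal[j])
    return shifted


def random_phase_shuffle(signal, max_shift=2, rng=random):
    """Apply phase shuffle with a shift drawn uniformly from
    [-max_shift, max_shift]."""
    return phase_shuffle(signal, rng.randint(-max_shift, max_shift))


if __name__ == "__main__":
    print(generator_output_length(4))          # DCGAN-style depth
    print(generator_output_length(5))          # one extra layer
    print(generator_output_length(5) / 16000)  # duration in seconds at 16 kHz
    print(random_phase_shuffle([0, 1, 2, 3, 4]))
```

Under these assumptions, four layers give 4096 samples and five give 16384, which matches the sizes in the feature list. In a PyTorch model the same shift would typically be applied to the Discriminator's intermediate activations along the time axis.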