pytorch-WaveGAN (Not Yet Completed)

A PyTorch implementation of WaveGAN (Donahue et al., 2018).

WaveGAN is a first attempt at synthesizing raw audio waveforms with a GAN.

Features

  • The overall architecture is based on DCGAN.
  • 2D Conv(5,5) -> 1D Conv(1,25): each 5x5 image kernel is flattened into a 1D kernel of length 25.
  • The original DCGAN output size is 4096 (a 64x64 image). One layer is added to enlarge the output to 16384 samples.
  • 16384 samples is slightly more than 1 second of raw audio at 16 kHz.
  • Audio data is converted from 16-bit to 32-bit floating point.
  • A post-processing filter is trained to avoid checkerboard artifacts.
  • Phase shuffle keeps the discriminator from learning to detect checkerboard artifacts.
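
A few of the features above can be sketched in PyTorch as follows. This is a minimal illustration, not the code in this repository: the function names, the stride of 4, the padding values, and the default `max_shift=2` are assumptions chosen so that the shapes match the numbers in the list.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

SAMPLE_RATE = 16_000  # Hz, from the feature list
NUM_SAMPLES = 16_384  # generator output length, from the feature list


def pcm16_to_float32(pcm: torch.Tensor) -> torch.Tensor:
    """Scale 16-bit integer PCM samples to float32 values in [-1, 1)."""
    return pcm.to(torch.float32) / 32768.0


def make_upsample_layer(in_channels: int, out_channels: int) -> nn.Module:
    """1D transposed convolution with a kernel of length 25.

    Assumption: stride 4 with padding 11 and output_padding 1, so the
    output is exactly 4x longer than the input (4096 -> 16384).
    """
    return nn.ConvTranspose1d(
        in_channels, out_channels,
        kernel_size=25, stride=4, padding=11, output_padding=1,
    )


def phase_shuffle(x: torch.Tensor, max_shift: int = 2) -> torch.Tensor:
    """Shift each example of a (batch, channels, time) tensor in time.

    Every example gets its own random offset in [-max_shift, max_shift];
    samples shifted in from outside the signal are filled by reflection.
    """
    if max_shift == 0:
        return x
    length = x.shape[-1]
    shifts = torch.randint(-max_shift, max_shift + 1, (x.shape[0],))
    # Reflect-pad both ends once, then cut a window of the original length.
    padded = F.pad(x, (max_shift, max_shift), mode="reflect")
    rows = []
    for i, shift in enumerate(shifts.tolist()):
        start = max_shift - shift
        rows.append(padded[i, :, start:start + length])
    return torch.stack(rows)


if __name__ == "__main__":
    print(f"clip length: {NUM_SAMPLES / SAMPLE_RATE} s")
    layer = make_upsample_layer(2, 1)
    print(layer(torch.zeros(1, 2, 4096)).shape)
    print(phase_shuffle(torch.randn(4, 2, 64)).shape)
```

The clip length works out to 16384 / 16000 = 1.024 seconds, which is the "slightly more than 1 second" mentioned above. Phase shuffle is meant for the discriminator's intermediate activations only, so a sketch like this would be called between its convolution layers and not in the generator.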