
This repository contains implementations of GANSynth, WaveGAN and SpecGAN. See the report for more details.



Adversarial Audio Synthesis

This report summarizes the findings of the original Adversarial Audio Synthesis paper [1] and reproduces its results in PyTorch. Both SpecGAN and WaveGAN have been implemented.

We also experimented with the GANSynth model [2]. All the helper functions that GANSynth needs for audio processing are taken from [3].

[1] Donahue, C., McAuley, J. and Puckette, M., 2018. Adversarial Audio Synthesis.

[2] Engel, J., Agrawal, K.K., Chen, S., Gulrajani, I., Donahue, C. and Roberts, A., 2019. GANSynth: Adversarial Neural Audio Synthesis. arXiv preprint arXiv:1902.08710.

[3] https://github.com/magenta/magenta/tree/main/magenta/models/gansynth/lib

Demo site

Examples of generated audio clips can be found on the demo page: https://ecbme6040.github.io/e6691-2022spring-project-WAVE-an3078-bmh2168-gs3160/

Libraries for SpecGAN & WaveGAN

  • librosa (pip install librosa; it also needs the libsndfile system library: sudo apt-get install libsndfile1)
  • torchaudio (conda install torchaudio -c pytorch)
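
A quick way to confirm that both libraries are visible to the current Python environment is sketched below. It uses only the standard library; the helper name `missing_packages` is hypothetical and not part of this repository.

```python
import importlib.util

# Top-level package names taken from the list above.
REQUIRED = ("librosa", "torchaudio")


def missing_packages(names=REQUIRED):
    """Return the top-level packages in `names` that cannot be found.

    find_spec only locates a package; it does not import it, so this
    check does not prove that the package works, only that it is installed.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("librosa and torchaudio were found.")
```

If a package is reported as not installed, the install commands in the list above apply.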

Download the models

Model weights are organized by model, then by dataset folder (the link is given below the tree).

Drive directory tree
├── WaveGAN/
│   ├── drum/
│   │   ├── examples_samples.pt
│   │   ├── generator.pt
│   │   └── discriminator.pt
│   ├── piano/
│   │   ├── examples_samples.pt
│   │   ├── generator.pt
│   │   └── discriminator.pt
│   └── sc09/
│       ├── examples_samples.pt
│       ├── generator.pt
│       └── discriminator.pt
├── SpecGAN/
│   ├── drum/
│   │   ├── examples_samples.pt
│   │   ├── generator.pt
│   │   └── discriminator.pt
│   ├── piano/
│   │   ├── examples_samples.pt
│   │   ├── generator.pt
│   │   └── discriminator.pt
│   └── sc09/
│       ├── examples_samples.pt
│       ├── generator.pt
│       └── discriminator.pt
└── GANSynth/
    └── NSynth/
        ├── checkpoint
        ├── model.ckpt-11000000.meta
        ├── model.ckpt-11000000.index
        └── model.ckpt-11000000.data-00000-of-00001

Lion drive link (https://drive.google.com/drive/folders/1CPD3boEK5Dw2LmLcUIzUJOnStPdkuBL5?usp=sharing)
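
After downloading, it can help to check that the local copy matches the Drive directory tree. The following is a minimal sketch using only the standard library. The file and folder names come from the tree above, while the function names and the local folder name `weights` are hypothetical and not part of this repository.

```python
from pathlib import Path

# Layout copied from the Drive directory tree above.
PT_MODELS = ("WaveGAN", "SpecGAN")
PT_DATASETS = ("drum", "piano", "sc09")
PT_FILES = ("examples_samples.pt", "generator.pt", "discriminator.pt")
GANSYNTH_FILES = (
    "checkpoint",
    "model.ckpt-11000000.meta",
    "model.ckpt-11000000.index",
    "model.ckpt-11000000.data-00000-of-00001",
)


def expected_files(root):
    """Return every file path listed in the tree, placed under `root`."""
    root = Path(root)
    paths = [
        root / model / dataset / name
        for model in PT_MODELS
        for dataset in PT_DATASETS
        for name in PT_FILES
    ]
    paths += [root / "GANSynth" / "NSynth" / name for name in GANSYNTH_FILES]
    return paths


def missing_files(root):
    """Return the expected paths that do not exist as files under `root`."""
    return [path for path in expected_files(root) if not path.is_file()]


if __name__ == "__main__":
    # "weights" is an assumed local folder holding the downloaded Drive contents.
    for path in missing_files("weights"):
        print("missing:", path)
```

An empty result from `missing_files` means every file named in the tree is present. It does not verify that the files themselves are intact.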

Download the data sets

Organization of this directory

The './WaveGan and SpecGAN' folder contains the code for the WaveGAN and SpecGAN models

The './GANSynth' folder contains the code for the GANSynth model

The './docs' folder contains the website code and audio examples

│   .gitignore
│   E6691.2022Spring.WAVE.report.an3078.bmh2168.gs3160.pdf
│   README.md
├───docs
│   │   README.md
│   │
│   └───examples
│       │   README.md
│       │
│       ├───GANSynth
│       │       generated_1.mp3
│       │       generated_2.mp3
│       │       generated_3.mp3
│       │       real_1.mp3
│       │       real_2.mp3
│       │       real_3.mp3
│       │
│       ├───paper
│       │       specgan_drums.mp3
│       │       specgan_piano.mp3
│       │       specgan_sc09.mp3
│       │       wavegan_drums.mp3
│       │       wavegan_piano.mp3
│       │       wavegan_sc09.mp3
│       │
│       ├───specgan
│       │       drum denoised.mp3
│       │       drum.mp3
│       │       piano.mp3
│       │       sc09.mp3
│       │
│       └───wavegan
│               drum n=0.mp3
│               drum n=2.mp3
│               piano.mp3
│               sc09.mp3
├───GANSynth
│   │   gansynth_generate.py
│   │   gansynth_train.py
│   │   README.md
│   │   __init__.py
│   │
│   ├───configs
│   │       mel_prog_hires.py
│   │       __init__.py
│   │
│   └───lib
│           datasets.py
│           data_helpers.py
│           data_normalizer.py
│           flags.py
│           generate_util.py
│           layers.py
│           model.py
│           networks.py
│           network_functions.py
│           specgrams_helper.py
│           specgrams_helper_test.py
│           spectral_ops.py
│           spectral_ops_test.py
│           train_util.py
│           util.py
│           __init__.py
└───WaveGan and SpecGAN
    │   Generate audio.ipynb
    │   Inception score.ipynb
    │   Inception training.ipynb
    │   README.md
    │   SpecGan Training.ipynb
    │   Wavegan Training.ipynb
        │   generate_show_audio.py
        │   README.md
        │   specgan.py
        │   split_data.py
        │   utils.py
        │   wavegan.py