This is an implementation of the WaveNet architecture, as described in the original paper (van den Oord et al., 2016, arXiv:1609.03499).
Features:
- Automatic creation of a dataset (training and validation/test set) from all sound files (.wav, .aiff, .mp3) in a directory, as sketched after this list
- Efficient multithreaded data loading
- Logging to TensorBoard (training loss, validation loss, validation accuracy, parameter and gradient histograms, generated samples); a minimal logger is sketched below
- Fast generation, as introduced in the Fast Wavenet Generation Algorithm paper (arXiv:1611.09482); the per-layer queue idea is sketched below
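To make the dataset and loading features concrete, here is a minimal sketch of how a directory of audio files could be turned into quantized training slices and loaded in parallel. The class name `AudioSliceDataset`, the mu-law quantization, and all parameter values are illustrative assumptions, not necessarily what this repository does internally:

```python
import os
import numpy as np
import librosa
import torch
from torch.utils.data import Dataset, DataLoader

class AudioSliceDataset(Dataset):
    """Hypothetical dataset: cuts all audio files in a directory into fixed-length slices."""

    def __init__(self, directory, sample_rate=16000, slice_length=16000, classes=256):
        self.classes = classes
        self.slices = []
        for name in sorted(os.listdir(directory)):
            if not name.lower().endswith(('.wav', '.aiff', '.mp3')):
                continue
            # librosa decodes all three formats to a float waveform in [-1, 1]
            audio, _ = librosa.load(os.path.join(directory, name),
                                    sr=sample_rate, mono=True)
            quantized = self.mu_law_encode(audio)
            # non-overlapping slices; the trailing partial slice is dropped
            for start in range(0, len(quantized) - slice_length, slice_length):
                self.slices.append(quantized[start:start + slice_length])

    def mu_law_encode(self, audio):
        # mu-law companding as in the WaveNet paper, mapped to integer classes
        mu = self.classes - 1
        encoded = np.sign(audio) * np.log1p(mu * np.abs(audio)) / np.log1p(mu)
        return ((encoded + 1) / 2 * mu).astype(np.int64)

    def __len__(self):
        return len(self.slices)

    def __getitem__(self, idx):
        slice_ = torch.from_numpy(self.slices[idx])
        return slice_[:-1], slice_[1:]  # input samples and next-sample targets

# Parallel loading: num_workers > 0 prepares batches in background worker processes.
# dataset = AudioSliceDataset('train_samples')
# loader = DataLoader(dataset, batch_size=16, shuffle=True, num_workers=4)
```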
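Since TensorFlow is only required for TensorBoard logging, a thin wrapper around the TensorFlow 1.x summary API is likely all that is needed. This sketch covers scalar values such as the training loss; histogram and audio summaries work analogously through other summary types. The class name and tags are assumptions for illustration:

```python
import tensorflow as tf

class TensorBoardLogger:
    """Hypothetical logger: writes scalar summaries that TensorBoard can display."""

    def __init__(self, log_dir):
        # FileWriter creates the event files that TensorBoard reads
        self.writer = tf.summary.FileWriter(log_dir)

    def log_scalar(self, tag, value, step):
        summary = tf.Summary(value=[tf.Summary.Value(tag=tag, simple_value=value)])
        self.writer.add_summary(summary, step)
        self.writer.flush()

# logger = TensorBoardLogger('logs/wavenet')
# logger.log_scalar('training_loss', loss_value, global_step)
```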
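The core idea of fast generation is that each dilated layer caches its recent inputs in a FIFO queue, so producing the next sample costs O(1) work per layer instead of recomputing the whole receptive field. The sketch below illustrates this with scalar channels, a kernel size of 2, and arbitrary weights; it is a toy version of the technique, not the repository's implementation:

```python
import numpy as np
from collections import deque

class FastGenerationLayer:
    """One dilated layer with a FIFO queue for incremental, sample-by-sample generation."""

    def __init__(self, dilation, w_past, w_cur):
        # The queue holds the last `dilation` inputs, replacing the
        # recomputation of the full receptive field at every step.
        self.queue = deque([0.0] * dilation)
        self.w_past, self.w_cur = w_past, w_cur

    def step(self, x):
        x_past = self.queue.popleft()  # the input from `dilation` steps ago
        self.queue.append(x)           # cache the current input for later steps
        return np.tanh(self.w_past * x_past + self.w_cur * x)

# A stack with doubling dilations generates one sample per forward pass:
layers = [FastGenerationLayer(d, 0.5, 0.5) for d in (1, 2, 4, 8)]
x = 0.1
for layer in layers:
    x = layer.step(x)
```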
Requirements:
- python 3
- pytorch 0.3
- numpy
- librosa
- jupyter
- tensorflow for TensorBoard logging
For an introduction to using this model, take a look at the WaveNet demo notebook. Audio clips generated by a simple trained model can be found in the generated samples directory.