A PyTorch implementation of VAWGAN (conditional VAE-WGAN) for non-parallel voice conversion, re-purposed for emotional voice conversion on the SAVEE dataset.
- Python 3.6
- tensorflow >= 1.5.0
- PyTorch >= 0.4.0
- PyWorld
- librosa
- soundfile
# feature extraction
python analyzer.py \
--dir_to_wav dataset/savee/wav \
--dir_to_bin dataset/savee/bin
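analyzer.py uses PyWorld to decompose each waveform into F0, spectral envelope, and aperiodicity, then writes the frames to a `.bin` file. A rough sketch of the packing step, assuming flat float32 rows with a trailing speaker/emotion label (the exact field layout in analyzer.py may differ):

```python
import numpy as np

SP_DIM = 513   # spectral-envelope bins (assumed; depends on FFT size)
AP_DIM = 513   # aperiodicity bins (assumed)

def pack_frames(sp, ap, f0, speaker_id):
    """Concatenate per-frame WORLD features into one float32 matrix.

    In analyzer.py the arrays would come from PyWorld, e.g.
        f0, t = pyworld.harvest(wav, fs)
        sp = pyworld.cheaptrick(wav, f0, t, fs)
        ap = pyworld.d4c(wav, f0, t, fs)
    """
    n = len(f0)
    label = np.full((n, 1), speaker_id, dtype=np.float32)
    return np.hstack([sp, ap, f0.reshape(-1, 1), label]).astype(np.float32)

# Toy example with random arrays standing in for WORLD output.
rng = np.random.default_rng(0)
sp = rng.random((100, SP_DIM))
ap = rng.random((100, AP_DIM))
f0 = rng.random(100)
frames = pack_frames(sp, ap, f0, speaker_id=3)
frames.tofile("packed_demo.bin")

# Reading side: reshape the flat binary back into frames.
loaded = np.fromfile("packed_demo.bin", np.float32).reshape(-1, SP_DIM + AP_DIM + 2)
```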
# collect stats
python build.py \
--train_file_pattern "dataset/savee/bin/*/*.bin" \
--corpus_name savee
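build.py scans the `.bin` files matched by `--train_file_pattern` and collects corpus statistics used later for feature normalization. A minimal sketch of the idea, assuming per-dimension z-normalization over packed frames (what build.py actually stores may differ):

```python
import glob
import numpy as np

def collect_stats(pattern, dim):
    """Stream over every .bin file, accumulating running sums so the
    whole corpus never has to fit in memory at once."""
    total = np.zeros(dim)
    total_sq = np.zeros(dim)
    count = 0
    for path in glob.glob(pattern):
        frames = np.fromfile(path, np.float32).reshape(-1, dim).astype(np.float64)
        total += frames.sum(axis=0)
        total_sq += (frames ** 2).sum(axis=0)
        count += len(frames)
    mean = total / count
    # Clamp at zero to guard against tiny negative values from rounding.
    std = np.sqrt(np.maximum(total_sq / count - mean ** 2, 0.0))
    return mean, std

# Demo: two tiny files with known statistics (mean 0.5, std 0.5 per dim).
np.zeros((10, 4), np.float32).tofile("stats_a.bin")
np.ones((10, 4), np.float32).tofile("stats_b.bin")
mean, std = collect_stats("stats_*.bin", dim=4)
```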
# training
python train.py architecture-vawgan-savee.json savee
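train.py couples a conditional VAE with a WGAN critic. The interaction of the three losses can be sketched as below; the layer sizes, loss weights, and use of weight clipping for the critic are illustrative assumptions, not the repo's exact hyper-parameters:

```python
import torch
import torch.nn as nn

FEAT, LATENT, N_SPK = 513, 64, 4  # toy sizes (assumed)

enc = nn.Sequential(nn.Linear(FEAT, 128), nn.ReLU(), nn.Linear(128, 2 * LATENT))
dec = nn.Sequential(nn.Linear(LATENT + N_SPK, 128), nn.ReLU(), nn.Linear(128, FEAT))
critic = nn.Sequential(nn.Linear(FEAT, 128), nn.ReLU(), nn.Linear(128, 1))

def vae_step(x, y_onehot):
    """Conditional VAE loss: reconstruction + KL(q(z|x) || N(0, I)).
    The identity code y enters only at the decoder."""
    mu, logvar = enc(x).chunk(2, dim=1)
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
    xh = dec(torch.cat([z, y_onehot], dim=1))
    recon = (xh - x).pow(2).mean()
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
    return recon + kl, xh

def critic_step(x_real, x_fake):
    """WGAN critic loss: minimize E[D(fake)] - E[D(real)]; after each update
    the critic weights would be clipped to keep D roughly 1-Lipschitz."""
    return critic(x_fake.detach()).mean() - critic(x_real).mean()

x = torch.randn(8, FEAT)
y = torch.eye(N_SPK)[torch.randint(0, N_SPK, (8,))]
vae_loss, x_fake = vae_step(x, y)
d_loss = critic_step(x, x_fake)
# Generator side adds the adversarial term on top of the VAE objective.
g_loss = vae_loss - critic(x_fake).mean()
```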
# conversion
python convert.py architecture-vawgan-savee.json savee THap
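convert.py runs the trained encoder/decoder with the target identity (`THap` above) substituted at decode time. Schematically, with toy module shapes that are assumptions rather than the repo's architecture:

```python
import torch
import torch.nn as nn

FEAT, LATENT, N_SPK = 513, 64, 4  # toy sizes (assumed)
enc = nn.Sequential(nn.Linear(FEAT, 128), nn.ReLU(), nn.Linear(128, LATENT))
dec = nn.Sequential(nn.Linear(LATENT + N_SPK, 128), nn.ReLU(), nn.Linear(128, FEAT))

def convert(x_src, target_id):
    """Encode source frames, then decode with the *target* one-hot code.
    The identity lives only in the decoder condition, so swapping it
    converts the voice while the content code z is kept."""
    z = enc(x_src)
    y = torch.eye(N_SPK)[target_id].expand(len(x_src), -1)
    return dec(torch.cat([z, y], dim=1))

frames = torch.randn(100, FEAT)        # normalized source spectral frames
converted = convert(frames, target_id=2)
# A real run would then denormalize and resynthesize audio with PyWorld.
```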
Adjust model and training hyper-parameters in the architecture JSON file (architecture-vawgan-savee.json).
- It supports only 1-to-1 voice conversion.
- Setting `epoch_vawgan` in the architecture file to 0 results in a 1-to-1 VAE.
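For example, a VAE-only run could be configured with a fragment like the following; only the `epoch_vawgan` key comes from this README, and the surrounding layout is a guess at the JSON structure:

```json
{
  "training": {
    "epoch_vawgan": 0
  }
}
```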