PyTorch implementation of Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus.
See requirements in requirement.txt:
- linux
- python 3.6
- pytorch 1.0+
- librosa
- json, tqdm, logging
- 1115: update checkpoint
- 1026: upload code
- 1024: implement multi-singer & perceptual loss
- 1023: implement singer encoder
- Put your wav files in the data directory
- Edit configuration in config/config.yaml
Pretrain the Singer Embedding Extractor using the repository here, and set 'enc_model_fpath' in config/config.yaml accordingly.
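The relevant entry in config/config.yaml might look like the following; the key name comes from the text above, while the path is purely illustrative:

```yaml
# Path to the pretrained singer-embedding extractor checkpoint.
# The path below is an example; point it at your own pretrained model.
enc_model_fpath: checkpoints/encoder/pretrained.pt
```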
Extract mel-spectrogram
python preprocess.py -i data/wavs -o data/feature -c config/config.yaml
-i: your audio folder
-o: output acoustic feature folder
-c: config file
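For reference, the core of mel-spectrogram extraction can be sketched as below. This is not the repo's preprocess.py (which may use librosa and the parameters from config/config.yaml); the sample rate, FFT size, hop size, and mel count here are illustrative defaults, and only NumPy is used so the sketch is self-contained:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr=22050, n_fft=1024, n_mels=80):
    # Triangular filters spaced evenly on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def melspectrogram(wav, sr=22050, n_fft=1024, hop=256, n_mels=80):
    # Frame, window, FFT, then project the power spectrum onto mel filters.
    pad = n_fft // 2
    wav = np.pad(wav, (pad, pad), mode="reflect")
    frames = np.lib.stride_tricks.sliding_window_view(wav, n_fft)[::hop]
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)) ** 2
    mel = spec @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(np.maximum(mel, 1e-5)).T  # shape: (n_mels, n_frames)
```

The resulting (n_mels, n_frames) arrays correspond to the per-utterance features that land in data/feature.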
Training conditioned on mel-spectrogram
python train.py -i data/feature -o checkpoints/ --config config/config.yaml
-i: acoustic feature folder
-o: directory to save checkpoints
-c: config file
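Conceptually, each training step is adversarial: a generator upsamples mel frames to waveform samples and a discriminator judges realness. The minimal sketch below uses toy stand-in networks and a least-squares GAN objective, not the paper's actual architecture or its perceptual losses; all sizes and names are illustrative:

```python
import torch
import torch.nn as nn

hop = 256  # illustrative hop size: one mel frame -> `hop` waveform samples

generator = nn.Sequential(          # toy stand-in for the real generator
    nn.Conv1d(80, 64, 7, padding=3), nn.ReLU(),
    nn.ConvTranspose1d(64, 1, hop * 2, stride=hop, padding=hop // 2),
)
discriminator = nn.Sequential(      # toy stand-in for the real discriminator
    nn.Conv1d(1, 32, 15, stride=4, padding=7), nn.LeakyReLU(0.2),
    nn.Conv1d(32, 1, 3, padding=1),
)
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

mel = torch.randn(4, 80, 32)        # (batch, n_mels, frames)
real = torch.randn(4, 1, 32 * hop)  # matching real waveform segment

# Discriminator step: push real -> 1 and fake -> 0.
fake = generator(mel).detach()
loss_d = ((discriminator(real) - 1) ** 2).mean() + (discriminator(fake) ** 2).mean()
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: fool the discriminator (the real system adds further losses).
fake = generator(mel)
loss_g = ((discriminator(fake) - 1) ** 2).mean()
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```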
Inference conditioned on mel-spectrogram
python inference.py -i data/feature -o outputs/ -c checkpoints/*.pkl -g config/config.yaml
-i: acoustic feature folder
-o: directory to save generated audio
-c: checkpoint file
-g: config file
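At inference time the trained generator simply maps saved mel features back to a waveform. The sketch below assumes a checkpoint that stores the generator weights under a "generator" key and uses the same toy architecture as above; the actual checkpoint layout and model in this repo may differ, and all paths are illustrative:

```python
import numpy as np
import torch
import torch.nn as nn

hop = 256
generator = nn.Sequential(  # must match the trained architecture
    nn.Conv1d(80, 64, 7, padding=3), nn.ReLU(),
    nn.ConvTranspose1d(64, 1, hop * 2, stride=hop, padding=hop // 2),
)
# state = torch.load("checkpoints/model.pkl", map_location="cpu")
# generator.load_state_dict(state["generator"])
generator.eval()

mel = torch.randn(1, 80, 120)               # stand-in for a saved feature file
with torch.no_grad():
    wav = generator(mel).squeeze().numpy()  # (120 * hop,) waveform samples
# e.g. scipy.io.wavfile.write("outputs/sample.wav", 22050, wav) would save it
```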
For Singing Voice Synthesis:
- Use a modified FastSpeech 2 to synthesize the mel-spectrogram
- Feed the synthesized mel-spectrogram into Multi-Singer for waveform synthesis.
https://drive.google.com/file/d/18rrYmLSrr1CepCTy2lO_NoKlh2jUUoRR/view?usp=sharing
- GE2E
- FastSpeech 2
- Parallel WaveGAN
Please cite this repository via the "Cite this repository" button in the About section (top right of the main page).
For paper:
@inproceedings{huang2021multi,
title={Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus},
author={Huang, Rongjie and Chen, Feiyang and Ren, Yi and Liu, Jinglin and Cui, Chenye and Zhao, Zhou},
booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
pages={3945--3954},
year={2021}
}
Feel free to contact me at carseallenawm9@gmail.com