PyTorch implementation based on the ESPnet (https://github.com/espnet/espnet) toolkit; tag: v.202204
paper: https://arxiv.org/abs/2203.16852
demo: https://imdanboy.github.io/interspeech2022/
JETS consists of FastSpeech2, HiFi-GAN and an Alignment Module. The model files are located at espnet2/gan_tts/jets/
- clone the repo
git clone https://github.com/imdanboy/jets.git
- download ESPnet and patch the JETS code into it
cd jets; ./patch_to_espnet.sh
- install ESPnet as usual
cd jets/espnet/tools
./setup_venv.sh $(which python3)
make
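Once the patch and install steps above have run, a quick sanity check can confirm that the JETS model files are where this README says they are (espnet2/gan_tts/jets/). The following is a minimal sketch, assuming the patch script places ESPnet under jets/espnet, as the `cd` paths here suggest. The function name `check_jets_files` is hypothetical and not part of the repo.

```shell
# Hypothetical helper (not part of the repo): verify that the JETS model
# directory exists inside an ESPnet checkout.
# $1 = ESPnet root directory (assumed default: jets/espnet).
check_jets_files() {
    espnet_root="${1:-jets/espnet}"
    model_dir="${espnet_root}/espnet2/gan_tts/jets"
    if [ -d "${model_dir}" ]; then
        echo "found: ${model_dir}"
    else
        # Report the missing path on stderr and signal failure to the caller.
        echo "missing: ${model_dir}" >&2
        return 1
    fi
}
```

Run it from the directory that contains the clone, e.g. `check_jets_files jets/espnet`. A non-zero exit status suggests the patch step did not complete.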
- run the training script (tested on 4 V100 GPUs)
# LJSpeech training
cd jets/espnet/egs2/ljspeech/tts1
./run.sh --stage 1 --stop_stage 6 --ngpu 4
# KSS training
cd jets/espnet/egs2/kss/tts1
./run.sh --stage 1 --stop_stage 6 --ngpu 4
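The two training invocations above differ only in the recipe directory, so they can be parameterized. The following is a minimal sketch that only prints the command rather than running it. The function name `jets_train_cmd` is hypothetical, and GPU counts other than 4 are an untested assumption, since only the 4-GPU setup is reported here.

```shell
# Hypothetical helper (not part of the repo): print the training command
# for one of the two recipes shown above, without executing it.
# $1 = recipe name (ljspeech or kss); $2 = number of GPUs (default: 4).
jets_train_cmd() {
    recipe="$1"
    ngpu="${2:-4}"
    case "${recipe}" in
        ljspeech|kss) ;;
        *) echo "unknown recipe: ${recipe}" >&2; return 1 ;;
    esac
    # Same stages and flags as the commands above; only recipe and ngpu vary.
    echo "cd jets/espnet/egs2/${recipe}/tts1 && ./run.sh --stage 1 --stop_stage 6 --ngpu ${ngpu}"
}
```

For example, `jets_train_cmd kss` prints the KSS command with the default of 4 GPUs. Pipe the output to `sh` or copy it into a terminal to start training.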
JETS has been officially available in ESPnet since v.202205.