(Unofficial) PyTorch implementation of TacoSpawn, Speaker Generation, Stanton et al., 2021.
- Speaker Generation [arXiv:2111.05095]
- Unconditional VLB-TacoSpawn implementation.
Tested with Python 3.7.9 in a conda environment on Ubuntu; dependencies are listed in requirements.txt.
Download the LibriTTS dataset from OpenSLR.
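A LibriTTS subset such as train-clean-360 is laid out as speaker/chapter/utterance.wav, so speaker IDs can be read straight from the directory names. The following is a minimal sketch for checking a download, not part of this repository; the function name list_utterances_by_speaker is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def list_utterances_by_speaker(data_dir):
    """Map speaker ID -> sorted list of wav paths under a LibriTTS subset.

    Assumes the standard layout: <data_dir>/<speaker>/<chapter>/<name>.wav
    (hypothetical helper, not taken from this repository).
    """
    speakers = defaultdict(list)
    for wav in sorted(Path(data_dir).glob("*/*/*.wav")):
        # The speaker ID is the directory two levels above the wav file.
        speakers[wav.parent.parent.name].append(wav)
    return dict(speakers)
```

For example, len(list_utterances_by_speaker("/datasets/LibriTTS/train-clean-360")) gives the number of speakers found on disk.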
To train the model, run train.py.
python train.py --data-dir /datasets/LibriTTS/train-clean-360
Alternatively, dump the preprocessed dataset first to speed up training.
python -m utils.libritts.dump \
--data-dir /datasets/LibriTTS/train-clean-360 \
--output-dir /datasets/LibriTTS/train-clean-360-dump \
--num-proc 8
python train.py \
--data-dir /datasets/LibriTTS/train-clean-360-dump \
--from-dump
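The idea behind dumping is to run the expensive preprocessing once, in parallel, and let training read cached results instead of recomputing them every epoch. The sketch below illustrates that pattern only; it is not the code in utils.libritts.dump, and extract_features, dump_one, dump_all, load_dumped, and the .pkl format are all assumptions made for illustration.

```python
import pickle
from functools import partial
from multiprocessing import Pool
from pathlib import Path


def extract_features(path):
    """Hypothetical stand-in for costly preprocessing (e.g. wav -> spectrogram)."""
    return {"source": str(path), "num_bytes": Path(path).stat().st_size}


def dump_one(path, output_dir):
    """Preprocess one file and cache the result under output_dir."""
    out = Path(output_dir) / (Path(path).stem + ".pkl")
    with open(out, "wb") as f:
        pickle.dump(extract_features(path), f)
    return out


def dump_all(paths, output_dir, num_proc=1):
    """Run dump_one over all paths, optionally across worker processes."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    worker = partial(dump_one, output_dir=output_dir)
    if num_proc <= 1:
        return [worker(p) for p in paths]
    with Pool(num_proc) as pool:
        # pool.map preserves the order of the input paths.
        return pool.map(worker, paths)


def load_dumped(path):
    """Training-time read: deserialize a cached item instead of recomputing it."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

The trade-off is disk space for time: the dump directory holds one cached file per utterance, and the data loader only pays for deserialization.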
To resume training from a previous checkpoint, pass --load-epoch together with the saved config.
python train.py \
--data-dir /datasets/LibriTTS/train-clean-360-dump \
--from-dump \
--load-epoch 20 \
--config ./ckpt/t1.json
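Resuming needs two things: the saved configuration and the checkpoint for the requested epoch. The sketch below shows one way such a lookup could work. It is an illustration under assumptions, not this repository's logic: resolve_resume is a hypothetical helper, and the <name>/<name>_<epoch>.ckpt file naming is invented for the example.

```python
import json
from pathlib import Path


def resolve_resume(config_path, load_epoch, ckpt_dir="./ckpt"):
    """Return (config dict, checkpoint path) for resuming at load_epoch.

    The '<name>/<name>_<epoch>.ckpt' naming scheme is hypothetical.
    """
    config_path = Path(config_path)
    with open(config_path) as f:
        config = json.load(f)
    name = config_path.stem  # e.g. 't1' for './ckpt/t1.json'
    ckpt_path = Path(ckpt_dir) / name / f"{name}_{load_epoch}.ckpt"
    if not ckpt_path.exists():
        # Fail early rather than silently starting from scratch.
        raise FileNotFoundError(
            f"no checkpoint for epoch {load_epoch}: {ckpt_path}")
    return config, ckpt_path
```

Failing loudly on a missing checkpoint avoids the case where a mistyped epoch number quietly restarts training from scratch.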
Checkpoints are written to the path set in TrainConfig.ckpt, and TensorBoard summaries to the path set in TrainConfig.log.
python train.py
tensorboard --logdir ./log
[WIP] Inference and pretrained checkpoints