DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021
Medium series (with memes): Machine Learning Text-To-Speech: Intro, Little Theory and Math
The weights can be found inside the Hugging Face (hf) space.
DelightfulTTS + UnivNet, 22.05 kHz, check hf space demo PeechTTSv22050
DelightfulTTS Weights: epoch=5816-step=390418.ckpt
UnivNet Weights: vocoder_pretrained.pt
DelightfulTTS + HifiGAN, 44.1 kHz, check hf space demo PeechTTSv44100
DelightfulTTS Weights: epoch=2450-step=183470.ckpt
HifiGAN Weights: epoch=19-step=44480.ckpt
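The two released configurations can be summarised as a small lookup table. This is only a sketch: the `ModelVariant` class and the `pick_variant` helper are hypothetical names introduced here for illustration and are not part of the repository. The demo names, sample rates, and checkpoint file names are the ones listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelVariant:
    """Hypothetical record pairing an acoustic checkpoint with its vocoder."""

    demo: str            # hf space demo name
    vocoder: str         # vocoder family used with DelightfulTTS
    sample_rate_hz: int  # output sample rate in Hz
    acoustic_ckpt: str   # DelightfulTTS weights file
    vocoder_ckpt: str    # vocoder weights file


# Values copied from the list above; nothing else is assumed about the files.
VARIANTS = (
    ModelVariant("PeechTTSv22050", "UnivNet", 22050,
                 "epoch=5816-step=390418.ckpt", "vocoder_pretrained.pt"),
    ModelVariant("PeechTTSv44100", "HifiGAN", 44100,
                 "epoch=2450-step=183470.ckpt", "epoch=19-step=44480.ckpt"),
)


def pick_variant(sample_rate_hz: int) -> ModelVariant:
    """Return the variant whose output sample rate matches, or raise."""
    for variant in VARIANTS:
        if variant.sample_rate_hz == sample_rate_hz:
            return variant
    raise ValueError(f"no released variant for {sample_rate_hz} Hz")
```

For example, `pick_variant(44100).vocoder_ckpt` gives the HifiGAN checkpoint name, and an unsupported rate such as 16000 raises `ValueError`.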
sudo apt install ffmpeg libasound2-dev build-essential espeak-ng -y
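To confirm that the system packages landed on the `PATH`, a quick check along these lines may help. It is a sketch using only the standard library. The binary names are assumptions: `ffmpeg` and `espeak-ng` are the executables normally shipped by the packages of the same name, `gcc` stands in for `build-essential`, and `libasound2-dev` is skipped because it installs headers rather than a command. `missing_binaries` is a hypothetical helper, not part of the repository.

```python
import shutil

# Assumed executable names for the apt packages installed above.
REQUIRED_BINARIES = ("ffmpeg", "espeak-ng", "gcc")


def missing_binaries(names=REQUIRED_BINARIES):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_binaries()
    if missing:
        print("Missing executables:", ", ".join(missing))
    else:
        print("All required executables were found.")
```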
Create the environment from the environment.yml file:
conda env create -f environment.yml
# After the setup
conda activate tts_framework
# Install torch
pip install --upgrade --force-reinstall --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121
# Install nemo
pip install "nemo_toolkit[all]"
# Run demo
python app.py
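If the demo fails to start, it can be useful to check that the packages installed above are visible in the active environment. The sketch below uses only the standard library and does not import the packages themselves. The module names are assumptions based on the install commands (`nemo_toolkit` is imported as `nemo`), and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Top-level import names assumed for the packages installed above.
REQUIRED_MODULES = ("torch", "torchvision", "torchaudio", "nemo")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level modules that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not installed in this environment:", ", ".join(missing))
    else:
        print("All required Python packages were found.")
```

Run it after `conda activate tts_framework`; otherwise the check inspects whichever interpreter happens to be active.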
# live preview server
mkdocs serve
# build a static site from your Markdown files
mkdocs build
python -m unittest discover -v
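With default settings, `unittest` discovery collects files whose names match `test*.py` under the current directory. The sketch below shows the shape of a file that the command above would pick up. Everything in it is hypothetical: the file name `test_sample_rate.py`, the `samples_to_seconds` helper, and the test class are illustrations, not tests from the repository.

```python
# test_sample_rate.py (hypothetical file name matching the default test*.py pattern)
import unittest


def samples_to_seconds(num_samples: int, sample_rate_hz: int) -> float:
    """Hypothetical helper: duration in seconds of an audio buffer."""
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return num_samples / sample_rate_hz


class SampleRateTest(unittest.TestCase):
    def test_one_second_at_each_released_rate(self):
        # 22050 Hz and 44100 Hz are the two sample rates listed in this README.
        self.assertEqual(samples_to_seconds(22050, 22050), 1.0)
        self.assertEqual(samples_to_seconds(44100, 44100), 1.0)

    def test_rejects_non_positive_rate(self):
        with self.assertRaises(ValueError):
            samples_to_seconds(100, 0)
```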