Uberduck is a tool for fun and creativity with neural text-to-speech. This repository helps you create your own speech synthesis models. Please see our training and synthesis notebooks, and the Wiki.
The main "Tacotron2" model in this repository is based on NVIDIA's Mellotron. The main reasons to use this repository instead are:
- simple fill-populating and rhythm-predicting inference
- vocoders!
- more languages
- improved performance in fine-tuning using additive covariates
- improved tensorboard logging
- all types of categorical covariates either supported or in progress (multispeaker, torchmoji, signal-to-noise ratio, zero-shot, pitch support)
- sensibly refactored, production-tested code
The notebooks are the easiest way to try us out.
If you want to install on your own machine, create a virtual environment and install as follows:
conda create -n 'uberduck-ml-dev' python=3.8
conda activate uberduck-ml-dev
pip install git+https://github.com/uberduck-ai/uberduck-ml-dev.git
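After the commands above, a quick check that the package is visible to the active interpreter can save some debugging. The following is a minimal sketch, not part of the repository. It assumes the import name is `uberduck_ml_dev` (the repository name with underscores), which is not stated here; `is_installed` is a hypothetical helper.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# ASSUMPTION: the import name is the repository name with underscores.
MODULE_NAME = "uberduck_ml_dev"

if is_installed(MODULE_NAME):
    print(f"{MODULE_NAME} is importable")
else:
    print(f"{MODULE_NAME} not found; check that the environment is activated")
```

If the module is not found, the most likely cause is that the command was run outside the conda environment created above.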
Please see the tests subfolder for up-to-date examples of training and inference invocation.
We love contributions! Feel free to reach out to discuss a contribution.
To install in development mode, run:
pip install pre-commit black # format your code on commit by installing black!
git clone git@github.com:uberduck-ai/uberduck-ml-dev.git
cd uberduck-ml-dev
pre-commit install # Install required Git hooks
python setup.py develop # Install the library
To run the tests, go to the uberduck-ml-dev root folder in an environment or image with uberduck-ml-dev installed and run:
python -m pytest
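When iterating on a single component, it can help to run only a subset of the tests. The sketch below wraps the same `python -m pytest` invocation and optionally adds pytest's `-k` keyword filter. The helpers `pytest_command` and `run_tests` are hypothetical, and any keyword you pass (for example `"tacotron2"`) is only an illustration, not a confirmed test name in this repository.

```python
import subprocess
import sys
from typing import List, Optional


def pytest_command(keyword: Optional[str] = None) -> List[str]:
    """Build the `python -m pytest` command, optionally filtered with -k."""
    cmd = [sys.executable, "-m", "pytest"]
    if keyword:
        # -k selects tests whose names match the given expression.
        cmd += ["-k", keyword]
    return cmd


def run_tests(keyword: Optional[str] = None, repo_root: str = ".") -> int:
    """Run pytest from repo_root and return its exit code (0 means success)."""
    return subprocess.run(pytest_command(keyword), cwd=repo_root).returncode
```

Calling `run_tests()` from the repository root is equivalent to the command above; `run_tests("tacotron2")` would run only tests whose names match that keyword, if any exist.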