An auxiliary tool that simplifies speech generation for arbitrary texts
To generate speech for an arbitrary text, use the following command:
python -m rr say 'Привет, мир'
Example command for generating an audiobook from a txt file:
python -m rr say -e silero -a baya -grx -b 10000 -t assets/player-one.txt
By default the RuTTS toolkit is used, but you can specify another model using the -e (--engine) CLI argument:
python -m rr say 'Привет, мир' -e bark
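If several texts need to be voiced with the same engine, the command can be driven from a script. The following is only a sketch: the helper names build_say_command and say_all are hypothetical and not part of rr, and it uses nothing beyond the say subcommand and the -e flag shown above.

```python
import subprocess
import sys


def build_say_command(text, engine=None):
    """Return the argv list for `python -m rr say`, optionally with -e."""
    command = [sys.executable, "-m", "rr", "say", text]
    if engine is not None:
        command += ["-e", engine]
    return command


def say_all(texts, engine=None):
    """Run the say command once per text; raises on the first failure."""
    for text in texts:
        subprocess.run(build_say_command(text, engine), check=True)
```

For example, say_all(['Привет, мир'], engine='bark') would run the same command as the one above, assuming it is executed from the repository root.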
Currently the following engines are supported:
- rutts - an economical model for Russian texts only;
- bark - a multilingual model, requires a lot of GPU resources;
- salute - an adapter to the cloud service from Sber, requires the environment variable SALUTE_SPEECH_AUTH to be set;
- crt - an adapter to the cloud service from CRT, requires the environment variables CRT_USERNAME, CRT_PASSWORD, CRT_DOMAIN to be set;
- coqui - the multilingual xtts model, uses a moderate amount of GPU resources (much less than bark, while working better) but is very slow (270.43189 seconds per anecdote on average on a GTX 1650 vs 3-5 seconds for rutts); requires the file assets/female.wav (which can be downloaded from here and replaced with a recording of the desired speaker's voice);
- silero - amazing models for speech generation, which produce good-quality audio in a reasonable amount of time without requiring a lot of resources.
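Before starting a long run with one of the cloud engines, it may help to check that the variables named in the list above are set. This is a minimal sketch: the REQUIRED_ENV mapping and the missing_env_vars helper are hypothetical names, not part of rr, and how rr itself reads these variables is not shown here.

```python
import os

# Environment variables named in the engine list above.
# Engines that are not listed here are assumed to need none.
REQUIRED_ENV = {
    "salute": ("SALUTE_SPEECH_AUTH",),
    "crt": ("CRT_USERNAME", "CRT_PASSWORD", "CRT_DOMAIN"),
}


def missing_env_vars(engine, environ=None):
    """Return the required variables that are unset or empty for the engine."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV.get(engine, ()) if not environ.get(name)]
```

Calling missing_env_vars('crt') returns an empty list when all three CRT variables are set, and otherwise lists the ones that still need to be exported.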
For a full list of available CLI options see __main__.py.
To add background music to speech, use the overlay command:
python -m rr overlay "$HOME/Music/sayonara" "$HOME/Downloads/jap.mp3" "$HOME/Music/sayonara-overlay"
Here the -v (--volume) option can be used to adjust the volume of the background music. The default value is 0.2.
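To give an intuition for what a volume of 0.2 means, here is a conceptual sketch of mixing two tracks. It assumes the volume is a linear gain applied to the music samples before they are added to the speech, which may differ from what rr actually does (for instance, it might work in decibels). The mix function is hypothetical and operates on plain lists of float samples in the range [-1, 1].

```python
def mix(speech, music, volume=0.2):
    """Add music, scaled by volume, to speech sample by sample.

    Assumptions: both inputs share one sample rate, the result has the
    length of the speech, music shorter than the speech is padded with
    silence, and the sum is clipped to [-1, 1].
    """
    mixed = []
    for i, sample in enumerate(speech):
        background = music[i] if i < len(music) else 0.0
        value = sample + volume * background
        mixed.append(max(-1.0, min(1.0, value)))
    return mixed
```

With this model, a lower volume keeps the speech dominant, while values close to 1 let the music compete with the voice and make clipping more likely.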
The app natively supports one specific use case: synthesizing speech for anecdotes from this kaggle dataset. The command is similar to the examples listed above. To use the rutts model for reading the first 10 anecdotes aloud, type:
python -m rr handle-aneks -n 10
You can also use another model and specify input and output paths:
python -m rr handle-aneks -e bark -s 'assets/anecdotes.tsv' -d 'assets/anecdotes' -n 10
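To preview which rows a run limited with -n would cover, the source file can be inspected directly. The sketch below only reads the first n rows of tab-separated text; the read_first_rows helper is hypothetical, and the column layout of the dataset is not assumed. If the file has a header row, it counts as one of the rows here, which may not match how rr counts anecdotes.

```python
import csv
from itertools import islice


def read_first_rows(lines, n):
    """Return the first n rows of tab-separated lines as lists of strings.

    `lines` can be any iterable of strings, for example a file opened
    with open(path, encoding="utf-8", newline="").
    """
    return list(islice(csv.reader(lines, delimiter="\t"), n))
```

For instance, passing the opened assets/anecdotes.tsv file together with n=10 would return at most ten rows without loading the rest of the file.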
For a full list of available CLI options see __main__.py.
Also, see the example Jupyter notebook, which is regularly updated.
To create a conda environment with the required dependencies, run the following command:
conda env create -f environment.yml
Install the following dependencies manually:
sudo apt-get install libportaudio2
You also need to clone the unofficial Mail.ru Cloud API package to be able to seamlessly upload generated files to Mail.ru Cloud:
pushd "/home/$USER"
git clone git@github.com:zeionara/carma.git
popd
ln -s "/home/$USER/carma/cloud_mail_api"
To run the tests, use the following command:
python -m unittest discover test