About • Installation • How To Use • Final results • Credits • License
This repository contains an implementation of the HiFi-GAN vocoder.
See the task assignment here.
See the WandB report for implementation details and audio analysis.
Follow these steps to install the project:
- (Optional) Create and activate a new environment using conda.

      # create env
      conda create -n nv python=3.11
      # activate env
      conda activate nv

- Install all required packages.

      pip install -r requirements.txt

- Download the model checkpoint.

      python download_weights.py

- If you only want to synthesize one text/phrase and save it, run the following command:

      python synthesize.py 'text="YOUR_TEXT"' save_path=SAVE_PATH

  where SAVE_PATH is the path where the synthesized audio is saved. Pay attention to the quoting: the text is wrapped in double quotes, and the whole argument in single quotes.

- If you want to synthesize audio from text files, your directory with texts should have the following structure:

      NameOfTheDirectoryWithUtterances
      └── transcriptions
          ├── UtteranceID1.txt
          ├── UtteranceID2.txt
          .
          .
          .
          └── UtteranceIDn.txt

  Run the following command:

      python synthesize.py dir_path=DIR_PATH save_path=SAVE_PATH

  where DIR_PATH is the directory with the text files and SAVE_PATH is the path where the synthesized audio is saved.

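The directory layout and the two command forms above can also be prepared from Python. The following is a minimal sketch, not part of this repository: the helper names `write_transcriptions` and `build_synthesize_command` are hypothetical, and the sketch assumes only the `text`, `dir_path`, and `save_path` arguments shown above. Passing the command as a list avoids the shell quoting step, so only the inner double quotes around the text are needed.

```python
from pathlib import Path


def write_transcriptions(root, utterances):
    """Write each utterance to <root>/transcriptions/<utterance_id>.txt.

    Hypothetical helper: `utterances` maps an utterance ID to its text.
    """
    transcriptions_dir = Path(root) / "transcriptions"
    transcriptions_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for utterance_id, text in utterances.items():
        path = transcriptions_dir / f"{utterance_id}.txt"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths


def build_synthesize_command(save_path, text=None, dir_path=None):
    """Build the argument list for synthesize.py (hypothetical helper).

    Exactly one of `text` (single phrase) or `dir_path` (directory with
    a transcriptions/ subfolder) must be given.
    """
    if (text is None) == (dir_path is None):
        raise ValueError("pass exactly one of text or dir_path")
    command = ["python", "synthesize.py"]
    if text is not None:
        # Assumption: this sketch does not escape embedded double quotes.
        if '"' in text:
            raise ValueError("double quotes in text are not handled here")
        command.append(f'text="{text}"')
    else:
        command.append(f"dir_path={dir_path}")
    command.append(f"save_path={save_path}")
    return command
```

The returned list can be passed to `subprocess.run(command, check=True)` from the repository root, for example after calling `write_transcriptions("my_texts", {"UtteranceID1": "Hello world"})` and `build_synthesize_command("outputs", dir_path="my_texts")`.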
To reproduce this model, run the following command:

    python train.py

It takes around 3 days to train the model from scratch on an A100 GPU.
- Mihajlo Pupin was a founding member of National Advisory Committee for Aeronautics (NACA) on 3 March 1915, which later became NASA, and he participated in the founding of American Mathematical Society and American Physical Society.

  id4.mp4

- Leonard Bernstein was an American conductor, composer, pianist, music educator, author, and humanitarian. Considered to be one of the most important conductors of his time, he was the first American-born conductor to receive international acclaim.

  id5.mp4

- Lev Termen, better known as Leon Theremin was a Russian inventor, most famous for his invention of the theremin, one of the first electronic musical instruments and the first to be mass-produced.

  id3.mp4

- Deep Learning in Audio course at HSE University offers an exciting and challenging exploration of cutting-edge techniques in audio processing, from speech recognition to music analysis. With complex homeworks that push students to apply theory to real-world problems, it provides a hands-on, rigorous learning experience that is both demanding and rewarding.

  id1.mp4

- Dmitri Shostakovich was a Soviet-era Russian composer and pianist who became internationally known after the premiere of his First Symphony in 1926 and thereafter was regarded as a major composer.

  id2.mp4

WV-MOS=3.47 using text and WV-MOS=2.15 using MelSpecs.
This repository is based on a PyTorch Project Template.