fastspeech

A PyTorch implementation of the FastSpeech architecture.

Install

pip install -e '.[dev]'

How to use

Further documentation of the modules and how to use the library can be found at: https://ahadjawaid.github.io/fastspeech/

The first step in using the model for inference is to load it from a trained checkpoint:

model, norm = load_model_inference(checkpoint_path)
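
For reference, a slightly fuller sketch of this step is shown below. The import path and checkpoint location are assumptions for illustration; check the generated documentation for the module that actually exports load_model_inference.

from fastspeech.inference import load_model_inference  # assumed import path, may differ

checkpoint_path = "models/fastspeech.ckpt"  # example path to a trained checkpoint
model, norm = load_model_inference(checkpoint_path)  # model plus the normalizer used later to denormalize mels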

Next, we preprocess the text to convert it into a phoneme sequence the model can recognize:

text = "Hi, my name is ahod and this is a demonstration of my implementation of the fast speech model"
phones = preprocess_text(text, vocab_path)
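
As context for what this step produces, here is a hypothetical sketch of a phoneme-to-id lookup against a vocabulary file with one symbol per line. The file name and format are assumptions, and the real preprocess_text also performs grapheme-to-phoneme conversion.

# Hypothetical illustration only: map ARPAbet-style phoneme symbols to integer ids.
vocab_path = "data/phoneme_vocab.txt"  # assumed location and format (one symbol per line)
with open(vocab_path) as f:
    vocab = {symbol.strip(): idx for idx, symbol in enumerate(f)}

example_phones = ["HH", "AY1", "M", "AY1"]  # symbols for "Hi, my ..."
phone_ids = [vocab[p] for p in example_phones if p in vocab]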

Then we generate the mel spectrogram with the FastSpeech model:

mel = bayesian_inference(phones, model, 10)
mel = norm.denormalize(mel)
show_mel(mel)
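
If show_mel is not available in your environment, a generic matplotlib sketch such as the one below can display the result instead; it assumes the denormalized mel is a 2-D array of shape (n_mels, frames), possibly with a leading batch dimension.

import matplotlib.pyplot as plt
import numpy as np

# Convert to a NumPy array and drop a leading batch dimension if present.
mel_np = np.asarray(mel.detach().cpu()) if hasattr(mel, "detach") else np.asarray(mel)
plt.imshow(mel_np.squeeze(), origin="lower", aspect="auto")
plt.xlabel("Frames")
plt.ylabel("Mel bins")
plt.title("Predicted mel spectrogram")
plt.show()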

Lastly, we use a vocoder to convert the mel spectrogram into a waveform and write it out as a wav file. In this case the Griffin-Lim algorithm performs the inversion, as sketched below.

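A minimal sketch of that inversion using torchaudio's InverseMelScale and GriffinLim transforms is shown below. The sample rate, n_fft, and n_mels values are assumptions and must match the configuration used to compute the training mel spectrograms; the repository may also provide its own inversion helper.

import torch
import torchaudio
import soundfile as sf

sr = 22050    # assumed sample rate; must match the training configuration
n_fft = 1024  # assumed STFT size used when computing the training mels
n_mels = 80   # assumed number of mel bins

# Map the mel spectrogram back to a linear-frequency spectrogram,
# then estimate a waveform with Griffin-Lim.
inverse_mel = torchaudio.transforms.InverseMelScale(n_stft=n_fft // 2 + 1, n_mels=n_mels, sample_rate=sr)
griffin_lim = torchaudio.transforms.GriffinLim(n_fft=n_fft)

mel_tensor = torch.as_tensor(mel).detach().float().squeeze()  # assumes shape (n_mels, frames)
# If the model predicts log-mels, convert back to linear magnitude first, e.g. mel_tensor = mel_tensor.exp()
wav = griffin_lim(inverse_mel(mel_tensor)).numpy()

save_path = "output.wav"  # example output path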
sf.write(save_path, wav, sr)