Text2Mel input to WaveGlow outputs noisy audio file without any speech
deepconsc commented
I've retrained the Text2Mel model (described in https://arxiv.org/pdf/1710.08969.pdf), removing the mel reduction step from the preprocessor and changing the hparams to:
hop_length = 256
win_length = 1024
max_N = 180 # Maximum number of characters.
max_T = 210 # Maximum number of mel frames.
e = 512 # embedding dimension
d = 256 # Text2Mel hidden unit dimension
I'm trying to feed the generated mels to WaveGlow, but the output audio file is just a noisy honk.
Any ideas?
rafaelvalle commented
Make sure the mel-spectrogram preprocessing matches the one used in WaveGlow.
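One way to check this is to write down both sets of mel settings and diff them. The sketch below is only an illustration: `find_mismatches` and the dictionary key names are hypothetical, and the WaveGlow values are assumed from the repository's default `config.json` plus its natural-log compression of the mel magnitudes, so confirm them against your own checkout. Only `hop_length` and `win_length` on the Text2Mel side come from the post above; everything else is left unset so it shows up as something still to verify.

```python
def find_mismatches(producer, consumer):
    """Return {key: (producer_value, consumer_value)} for each consumer key that differs."""
    mismatches = {}
    for key, expected in consumer.items():
        actual = producer.get(key)  # None when the producer side is unknown
        if actual != expected:
            mismatches[key] = (actual, expected)
    return mismatches


# Assumed reference values; confirm against config.json in your WaveGlow checkout.
waveglow_mel = {
    "sampling_rate": 22050,
    "filter_length": 1024,
    "hop_length": 256,
    "win_length": 1024,
    "n_mel_channels": 80,
    "mel_fmin": 0.0,
    "mel_fmax": 8000.0,
    # Assumption: mels are log(clamp(mel, min=1e-5)), with no dB scaling
    # and no rescaling to [0, 1].
    "compression": "natural_log",
    "normalized_to_unit_range": False,
}

# Only these two values are stated in the post; fill in the rest from
# your Text2Mel preprocessor before trusting the result.
text2mel_mel = {
    "hop_length": 256,
    "win_length": 1024,
}

mismatches = find_mismatches(text2mel_mel, waveglow_mel)
for key, (actual, expected) in mismatches.items():
    print(f"{key}: Text2Mel={actual!r}, WaveGlow expects {expected!r}")
```

Any key reported with `None` on the Text2Mel side is simply unverified. A difference in sampling rate, number of mel channels, frequency range, or the log/normalization step means the vocoder receives values outside the range it was trained on, which would be consistent with noise and no speech.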