ksw0306/FloWaveNet

Why is the audio corresponding to the mel feature needed when synthesizing?

NewEricWang opened this issue · 3 comments

I only want to input the mel features generated by Tacotron 2. How should I modify the script "synthesize.py"?

Only the audio size is required, not the audio itself. You can replace one line of code (in def synthesize in synthesize.py) with the modified line below.

Current code:
q_0 = Normal(x.new_zeros(x.size()), x.new_ones(x.size()))

Replace the above with:
q_0 = Normal(c.new_zeros(1,1,c.size()[2]*256), c.new_ones(1,1,c.size()[2]*256))

256 is the hop_length from preprocessing.py, so c.size()[2]*256 is the number of audio samples to generate for the mel frames in c.
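
To make the size arithmetic explicit, here is a minimal sketch in plain Python. The helper name noise_shape is hypothetical (it is not part of FloWaveNet), and it assumes the mel tensor is laid out as (batch, n_mels, frames), which is what the c.size()[2] indexing above suggests. It also keeps the batch dimension instead of hard-coding 1.

```python
def noise_shape(mel_shape, hop_length=256):
    """Return the (batch, 1, samples) shape of the prior noise for a mel tensor.

    Hypothetical helper, not part of FloWaveNet. mel_shape is assumed to be
    (batch, n_mels, frames); hop_length should match the value used when the
    mel features were extracted (256 in preprocessing.py).
    """
    batch, _n_mels, frames = mel_shape
    # Each mel frame corresponds to hop_length audio samples.
    return (batch, 1, frames * hop_length)


# Example: one utterance, 80 mel bins, 100 frames.
shape = noise_shape((1, 80, 100))
print(shape)  # (1, 1, 25600)

# In synthesize.py this shape could then feed the prior, for example:
#   q_0 = Normal(c.new_zeros(shape), c.new_ones(shape))
```

If the mel features come from another project, the hop length used there would need to be passed in place of 256.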

@anupam456, thank you! It works.

Hi @NewEricWang, which tacotron2 project do you use?