ksw0306/ClariNet

High frequency in the Gaussian IAF?

azraelkuan opened this issue · 6 comments

There is a lot of noise in the high frequencies. Do you have any solution?

dhgrs commented

In my ClariNet repository, I found that using only generated means reduces noise.
dhgrs/chainer-ClariNet@5760605

That is, it predicts values instead of probability distributions.
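To make the difference concrete, here is a minimal sketch of "sample from the predicted Gaussian" versus "use only the mean". It is not taken from the linked commit: the function `generate`, the arrays `mu` and `log_sigma`, and the fixed scale are all hypothetical, and a real autoregressive model would recompute the parameters from its previous outputs at every step.

```python
import math
import random


def generate(mu, log_sigma, use_mean_only, rng):
    """Turn per-sample Gaussian parameters into waveform values.

    mu, log_sigma: hypothetical per-sample outputs of a Gaussian model.
    use_mean_only: if True, skip sampling and return the means as-is.
    """
    out = []
    for m, ls in zip(mu, log_sigma):
        if use_mean_only:
            out.append(m)
        else:
            # Sampling adds white noise scaled by the predicted std.
            out.append(m + math.exp(ls) * rng.gauss(0.0, 1.0))
    return out


def rms_diff(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


rng = random.Random(0)
n = 200
# Toy "predicted" parameters: a sine as the mean, a constant log-scale.
mu = [0.5 * math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
log_sigma = [-3.0] * n

sampled = generate(mu, log_sigma, use_mean_only=False, rng=rng)
mean_only = generate(mu, log_sigma, use_mean_only=True, rng=rng)

print("noise RMS when sampling :", rms_diff(sampled, mu))
print("noise RMS with mean only:", rms_diff(mean_only, mu))
```

The sampled signal carries broadband noise with a level set by the predicted scale, while the mean-only signal has none. That is one plausible reason the mean-only output sounds cleaner, though it no longer draws from the modeled distribution.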

@dhgrs although it sounds unbelievable, I will try it.

r9y9 commented

I think one possible reason is that there's no windowing process in STFT.

class STFT(torch.nn.Module):

Without windowing, we will see unexpected high-frequency values in the spectral domain due to the discontinuity at the frame edges in the time domain.
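The leakage effect can be checked with a small, dependency-free sketch. It is not the repository's STFT code: it takes one frame of a sine that does not fit an integer number of cycles into the frame, then compares the high-frequency part of its DFT with and without a Hann window. The frame length, test frequency, and helper name `dft_magnitude_db` are assumptions chosen for illustration.

```python
import cmath
import math


def dft_magnitude_db(frame):
    """Magnitude spectrum in dB relative to the peak, for bins 0..N/2."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    peak = max(mags)
    # Clamp to avoid log10(0) for bins that are numerically zero.
    return [20 * math.log10(max(m / peak, 1e-12)) for m in mags]


N = 256
# 10.5 cycles per frame: the frame edges do not line up, so the
# implicit periodic extension has a discontinuity.
frame = [math.sin(2 * math.pi * 10.5 * t / N) for t in range(N)]
# Periodic Hann window, tapering the frame to zero at its start.
hann = [0.5 - 0.5 * math.cos(2 * math.pi * t / N) for t in range(N)]
windowed = [x * w for x, w in zip(frame, hann)]

rect_db = dft_magnitude_db(frame)
hann_db = dft_magnitude_db(windowed)

# Average level in the upper half of the spectrum, far from the sine.
hi_bins = range(N // 4, N // 2 + 1)
rect_hi = sum(rect_db[k] for k in hi_bins) / len(hi_bins)
hann_hi = sum(hann_db[k] for k in hi_bins) / len(hi_bins)

print("mean high-frequency level, no window  (dB):", round(rect_hi, 1))
print("mean high-frequency level, Hann window (dB):", round(hann_hi, 1))
```

With PyTorch, the equivalent change would be to pass a window tensor such as `torch.hann_window(n_fft)` through the `window` argument of `torch.stft`.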

I also tested the stft function in PyTorch, and a good spectrogram loss is very important.
This is the predicted wav:
image
If we listen carefully, there is some noise in the background.

In WaveGlow and FloWaveNet,
I also found some noise, like in ksw0306/FloWaveNet#1 (comment),
but much smaller than in this picture.

There are 3 high-frequency noises in my synthesis with the teacher model. Has anybody met this kind of issue? The wave files are attached. Thank you.
generate_428934_0.wav is the synthesized wav, and generate_428934_0_truth.wav is the recorded wav used in training.
wav.zip

Hi, has anyone found a solution for the high-frequency noise?