Is it possible to load .npy spectrograms directly?
george-roussos opened this issue · 3 comments
Hi, in the notebook it isn't clear whether you can load your own spectrogram directly (for TTS inference) or not. Is this possible, and if so, have you tried it?
Do you mean by "your own spectrogram" a mel-spectrogram from another speaker? A trained WaveGrad can take any mel-spectrogram as input: it should be of type torch.Tensor and it should match the STFT parameters that were used to train WaveGrad. Generally, the output quality can depend on the speaker you feed into the model. I haven't tested WaveGrad on unseen speakers, but I believe it should perform well.
Thanks! No, I meant feeding WaveGrad a .npy mel from a TTS model trained on the same speaker, instead of running inference on the test set, because I didn't see it anywhere in the notebook (or I just missed it). I guess I pass it as a mel instead of iterating over a batch?
Yeah, if you have your mel of type np.array saved on disk, then just load it and convert it into torch.Tensor. For conversion you can use the classic torch.from_numpy(your_numpy_mel) or just torch.FloatTensor(your_numpy_mel). Finally, make sure that your mel has a batch dimension, e.g. 1 x n_mels x n_frames, and is on the same device as WaveGrad (CPU or GPU). Then you can feed it as input to the forward method.
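
A minimal sketch of these steps, assuming `model` is an already-loaded WaveGrad instance and `mel.npy` is a placeholder path for your saved spectrogram (the exact forward signature may differ slightly depending on the repo version):

```python
import numpy as np
import torch

# Assumptions: `model` is a loaded WaveGrad model, "mel.npy" is your saved mel
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

mel = np.load('mel.npy')              # np.array of shape n_mels x n_frames
mel = torch.from_numpy(mel).float()   # convert to torch.FloatTensor

if mel.dim() == 2:                    # add batch dimension if it is missing
    mel = mel.unsqueeze(0)            # -> 1 x n_mels x n_frames

mel = mel.to(device)                  # same device as the model
model = model.to(device)

with torch.no_grad():
    audio = model.forward(mel)        # generate the waveform from the mel
```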