Is it possible to load .npy spectrograms directly?
george-roussos opened this issue · 3 comments
Hi, in the notebook it isn't clear whether you can load your own spectrogram directly (for TTS inference) or not. Is this possible, and if so, have you tried it?
Do you mean by "your own spectrogram" a mel-spectrogram from another speaker? A trained WaveGrad can take any mel-spectrogram as input: it should be of type torch.Tensor and it should match the STFT parameters that were used to train WaveGrad. Generally, the output quality can depend on the speaker you feed into the model. I haven't tested WaveGrad on unseen speakers, but I believe it should perform well.
Thanks! No, I meant feeding WaveGrad a .npy mel from a TTS model trained on the same speaker, instead of running inference on the test set, because I didn't see it anywhere in the notebook (or I just missed it). I guess I pass it as a mel instead of iterating over a batch?
Yeah, if you have your mel of type np.array saved on disk, then just load it and convert it into torch.Tensor. For conversion you can use the classic torch.from_numpy(your_numpy_mel) or just torch.FloatTensor(your_numpy_mel). Finally, make sure that your mel has a batch dimension, e.g. 1 x n_mels x n_frames, and is on the same device as WaveGrad (CPU or GPU). Then you can feed it as input to the forward method.
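
A minimal sketch of these steps, assuming `model` is an already-loaded WaveGrad instance and `mel.npy` is a placeholder path for your saved spectrogram (the exact forward signature may differ slightly depending on the repo version):

```python
import numpy as np
import torch

# Assumptions: `model` is a loaded WaveGrad model, "mel.npy" is your saved mel
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

mel = np.load('mel.npy')              # np.array of shape n_mels x n_frames
mel = torch.from_numpy(mel).float()   # convert to torch.FloatTensor

if mel.dim() == 2:                    # add batch dimension if it is missing
    mel = mel.unsqueeze(0)            # -> 1 x n_mels x n_frames

mel = mel.to(device)                  # same device as the model
model = model.to(device)

with torch.no_grad():
    audio = model.forward(mel)        # generate the waveform from the mel
```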