Xiaobin-Rong/deepvqe

Why is the input tensor shape (B,F,T,2)?


Hi Xiaobin, thanks for the implementation of 'DeepVQE'. I've read the paper and your code, but I have some questions:
Why is the input tensor shape (B,F,T,2)?
What do 'B', 'F', and 'T' mean?
Thanks in advance.

Sorry for the confusion. The input tensor is a batch of noisy complex spectrograms: B is the batch size, F is the number of frequency bins, and T is the number of time frames. The final dimension of size 2 holds the real and imaginary parts of each spectrogram bin.
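
For anyone who wants to see the layout concretely, here is a minimal sketch of how a batch of waveforms could be turned into a (B,F,T,2) tensor with PyTorch. The helper name `waveform_to_model_input` and the STFT settings (`n_fft=512`, `hop_length=256`, Hann window) are assumptions chosen for illustration, not values confirmed in this thread, so check the repo for the settings the model actually expects.

```python
import torch


def waveform_to_model_input(wav, n_fft=512, hop_length=256):
    """Convert waveforms of shape (B, samples) to a real tensor of shape (B, F, T, 2).

    Hypothetical helper: n_fft and hop_length are illustrative assumptions.
    """
    window = torch.hann_window(n_fft, device=wav.device)
    # Complex STFT of shape (B, F, T), where F = n_fft // 2 + 1 (one-sided).
    spec = torch.stft(
        wav,
        n_fft=n_fft,
        hop_length=hop_length,
        window=window,
        return_complex=True,
    )
    # Split each complex bin into a trailing (real, imag) pair: (B, F, T, 2).
    return torch.view_as_real(spec)


wav = torch.randn(4, 16000)  # B=4 random "noisy" waveforms, 16000 samples each
x = waveform_to_model_input(wav)
print(x.shape)  # torch.Size([4, 257, 63, 2])
```

Here `x[..., 0]` is the real part and `x[..., 1]` is the imaginary part. With these assumed settings, F = 512 // 2 + 1 = 257 and T = 1 + 16000 // 256 = 63, because `torch.stft` pads the signal by default (`center=True`).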

Thanks!