Confusion about the calculation of the speech covariance matrix
vBaiCai opened this issue · 2 comments
vBaiCai commented
FYJNEVERFOLLOWS commented
All covariance matrices are computed across all 6 channels, yielding T * F * C * C matrices that contain the spatial information among channels. The author made a mistake in the paper: Y and S denote the 6-channel complex spectra of the mixture and the estimated speech, respectively, and Mag was meant to refer to the magnitude of the first channel's mixture spectrum. Sorry for his sloppiness, and thank you for pointing it out.
The revised version of our paper will be available later on arXiv: https://arxiv.org/abs/2207.07307
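For reference, the shapes described above can be sketched in NumPy as follows. This is only a minimal illustration, not the code from the paper or the repository: it assumes a (C, T, F) channel-first layout and an instantaneous outer product at each time-frequency bin, and the names `spatial_covariance`, `Y`, `phi_y` and `mag` are hypothetical. Any masking or temporal averaging used in the actual model is not shown.

```python
import numpy as np


def spatial_covariance(spec):
    """Per-bin spatial covariance of a multi-channel complex spectrum.

    spec: complex array of shape (C, T, F), an assumed channel-first layout.
    Returns an array of shape (T, F, C, C), where each C x C matrix is the
    outer product x x^H of the channel vector at that time-frequency bin.
    """
    return np.einsum("ctf,dtf->tfcd", spec, spec.conj())


# Toy 6-channel mixture spectrum (random data, for shape checking only).
rng = np.random.default_rng(0)
C, T, F = 6, 10, 257
Y = rng.standard_normal((C, T, F)) + 1j * rng.standard_normal((C, T, F))

phi_y = spatial_covariance(Y)  # covariance among all 6 channels
mag = np.abs(Y[0])             # magnitude of the first channel's mixture spectrum

print(phi_y.shape)  # (10, 257, 6, 6)
print(mag.shape)    # (10, 257)
```

The same function would apply to the estimated speech spectrum S, giving a second T * F * C * C tensor.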
vBaiCai commented
Hi, FYJNEVERFOLLOWS,
Now I can understand. The C * C spatial covariance is reasonable.
Thanks for your quick reply!