yluo42/TAC

How to understand the rescaling with SNR in create_dataset.py

changxuding opened this issue · 2 comments

Hello,
I am a little confused about these operations in create_dataset.py:
"spk2 = spk2 / np.sqrt(np.sum(spk2**2)+1e-8) * 1e2" and
"noise = noise / np.sqrt(np.sum(noise**2)+1e-8) * np.sqrt(np.sum((spk1+spk2)**2)+1e-8)"
Could you please explain why they are multiplied by "1e2" and "np.sqrt(np.sum((spk1+spk2)**2)+1e-8)" at the end?
Thanks

Hi,

The scalar 1e2 is there to avoid underflow issues, since the normalized waveforms may have very small energy. The noise is rescaled this way because the SNR is defined between the speech mixture and the noise: the noise is first given the same energy as the mixture (spk1+spk2), i.e. 0 dB, and the target SNR is then applied relative to that.
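
To make the two steps concrete, here is a minimal sketch of the rescaling, not the exact code from create_dataset.py. The function name rescale_with_snr and the parameter noise_snr_db are hypothetical, and the final division by 10**(noise_snr_db/20) is my assumption about how the target SNR is applied once the noise matches the mixture energy.

```python
import numpy as np

def rescale_with_snr(spk1, spk2, noise, noise_snr_db, eps=1e-8):
    """Hypothetical helper illustrating the rescaling discussed above."""
    # Normalize each speaker to unit L2 norm, then multiply by 1e2 so the
    # per-sample amplitudes are not tiny.
    spk1 = spk1 / np.sqrt(np.sum(spk1 ** 2) + eps) * 1e2
    spk2 = spk2 / np.sqrt(np.sum(spk2 ** 2) + eps) * 1e2

    # Give the noise the same energy as the speech mixture (0 dB SNR).
    mixture = spk1 + spk2
    noise = noise / np.sqrt(np.sum(noise ** 2) + eps)
    noise = noise * np.sqrt(np.sum(mixture ** 2) + eps)

    # Assumed step: attenuate the noise to reach the target mixture-to-noise SNR.
    noise = noise / 10 ** (noise_snr_db / 20.0)
    return spk1, spk2, noise

def measured_snr_db(signal, noise):
    # Energy ratio in dB between the signal and the noise.
    return 10 * np.log10(np.sum(signal ** 2) / np.sum(noise ** 2))
```

With random inputs, measured_snr_db(spk1 + spk2, noise) on the returned arrays should come out very close to noise_snr_db, which is the point of matching the noise energy to the mixture first.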

Understood, thanks a lot for explaining.