How to understand the rescaling with SNR in create_dataset.py

Question

changxuding opened this issue 4 years ago · 2 comments

Hello,
I am a little confused about these operations
"spk2 = spk2 / np.sqrt(np.sum(spk2**2)+1e-8) * 1e2" and
"noise = noise / np.sqrt(np.sum(noise**2)+1e-8) * np.sqrt(np.sum((spk1+spk2)**2)+1e-8)"
in create_dataset.py. Could you please explain why the results are multiplied by "1e2" and "np.sqrt(np.sum((spk1+spk2)**2)+1e-8)" at the end?
Thanks

Answer 1 · 2021-03-18T12:22:54.000Z

Hi,

The scalar 1e2 is there to avoid underflow issues, since the normalized (unit-energy) waveforms may have very small sample values. The noise is rescaled this way because the SNR is defined between the speech mixture and the noise: the noise is first normalized to unit energy and then given the same energy as the mixture spk1+spk2, so that the SNR can be set relative to the mixture.
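To make the two steps concrete, here is a minimal sketch of the idea, not the actual code from create_dataset.py. The function names and the final SNR gain step (10 ** (-snr_db / 20)) are assumptions added for illustration; only the normalization and the mixture-energy rescaling come from the lines quoted in the question.

```python
import numpy as np


def rescale_noise_to_snr(spk1, spk2, noise, snr_db, eps=1e-8):
    """Scale `noise` so that the mixture-to-noise ratio is about `snr_db` dB.

    Hypothetical helper for illustration only.
    """
    mixture = spk1 + spk2
    # Step 1: normalize the noise to (approximately) unit energy.
    noise = noise / np.sqrt(np.sum(noise ** 2) + eps)
    # Step 2: give the noise the same energy as the mixture (0 dB SNR).
    noise = noise * np.sqrt(np.sum(mixture ** 2) + eps)
    # Step 3 (assumed): attenuate or amplify the noise to reach the target SNR.
    return noise * 10.0 ** (-snr_db / 20.0)


def measured_snr_db(signal, noise):
    """Energy ratio between `signal` and `noise`, in dB."""
    return 10.0 * np.log10(np.sum(signal ** 2) / np.sum(noise ** 2))


rng = np.random.default_rng(0)
spk1 = rng.standard_normal(16000)
spk2 = rng.standard_normal(16000)
# Same normalization as in the question: unit energy, then scaled by 1e2.
spk2 = spk2 / np.sqrt(np.sum(spk2 ** 2) + 1e-8) * 1e2
noise = rng.standard_normal(16000)

scaled_noise = rescale_noise_to_snr(spk1, spk2, noise, snr_db=10.0)
print(measured_snr_db(spk1 + spk2, scaled_noise))
```

After step 2 the noise and the mixture have equal energy, so any gain applied afterwards directly sets the SNR with respect to the mixture rather than to a single speaker.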

Answer 2 · 2021-03-19T09:37:44.000Z

Understood, thanks a lot for explaining.