5000000*87, what does (87-42) mean?

Question

maggie0830 opened this issue 4 years ago · 3 comments

As we know, denoise.c extracts 42 features, but when we run "denoise_training speech.pcm noise.pcm" we get a 5000000*87 feature matrix. The 42 features are included in that matrix, so what do the remaining (87-42) = 45 columns mean? Thanks.

Answer 1 · 2021-01-04T08:31:54.000Z

42 extracted features + 22 expected gains + 22 noise log-spectrum values + 1 VAD value = 87, which is why we get a 5000000*87 matrix.
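To make that layout concrete, here is a minimal sketch that splits one 87-float row into its four parts. It assumes the columns are stored in the order listed above, that each value is a 32-bit float in little-endian byte order, and that the file is a flat sequence of rows. The names `NB_FEATURES`, `NB_BANDS`, `split_row` and `read_rows` are chosen here for illustration and are not taken from the thread.

```python
import struct

NB_FEATURES = 42                              # input features per frame
NB_BANDS = 22                                 # bands for gains and noise spectrum
ROW_WIDTH = NB_FEATURES + 2 * NB_BANDS + 1    # 42 + 22 + 22 + 1 = 87


def split_row(row):
    """Split one 87-value row into (features, gains, noise_log_spectrum, vad).

    Assumes the column order 42 features, 22 gains, 22 noise values, 1 VAD.
    """
    if len(row) != ROW_WIDTH:
        raise ValueError("expected %d values, got %d" % (ROW_WIDTH, len(row)))
    gains_start = NB_FEATURES
    noise_start = gains_start + NB_BANDS
    vad_index = noise_start + NB_BANDS
    features = row[:gains_start]
    gains = row[gains_start:noise_start]
    noise_log_spectrum = row[noise_start:vad_index]
    vad = row[vad_index]
    return features, gains, noise_log_spectrum, vad


def read_rows(path):
    """Yield rows from a flat float32 file (little-endian is an assumption)."""
    row_bytes = ROW_WIDTH * 4
    fmt = "<%df" % ROW_WIDTH
    with open(path, "rb") as f:
        while True:
            chunk = f.read(row_bytes)
            if len(chunk) < row_bytes:
                break  # end of file, or a trailing partial row
            yield struct.unpack(fmt, chunk)
```

Usage would look like `for row in read_rows("training.f32"): features, gains, noise, vad = split_row(row)`, where the file name is whatever the training output was redirected to.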

Answer 2 · 2021-05-29T21:12:32.000Z

I can't find much information on training. I want to train for 8 kHz narrowband audio, using the MS-SNSD dataset combined with the various noise datasets shared on the demo page.

Can you help with the following questions?

Do I need to downsample all datasets to 8 kHz, or do I just need to resample them to match the 16 kHz samples in MS-SNSD?
Should the signal.raw mentioned in TRAINING-README contain noise, or should I use the clean audio files provided in the MS-SNSD clean_train folder as is?
denoise_training takes only one signal.raw and one noise.raw, so how can I run it on multiple files, given that it overwrites training.f32 every time? Do I need to combine all clean audio files into one file and all noise files into another?
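One possible approach to the last question, which nobody in this thread confirms, is to join the inputs before training. The sketch below assumes every input is headerless raw PCM with the same sample rate, sample format and channel count, so the files can simply be appended byte for byte. The function name `concat_raw` and the file names in the usage note are hypothetical.

```python
import os
import shutil


def concat_raw(paths, out_path):
    """Append headerless PCM files back to back and return the output size.

    Only valid when all inputs share one sample rate, sample format and
    channel count. Files with headers (for example WAV) must be converted
    to raw PCM first, otherwise the headers end up inside the audio data.
    """
    with open(out_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)  # streams in chunks
    return os.path.getsize(out_path)
```

For example, `concat_raw(clean_files, "signal.raw")` and `concat_raw(noise_files, "noise.raw")` would produce the two inputs, where `clean_files` and `noise_files` are lists of raw file paths you have already resampled.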

Answer 3 · 2021-06-08T02:35:01.000Z

Hi, what does the denoise_training file do? I can't seem to open it to see what it does. Thanks.