5000000*87, what does (87-42) mean?
maggie0830 opened this issue · 3 comments
maggie0830 commented
As we know, denoise.c extracts 42 features, but when we run "denoise_training speech.pcm noise.pcm" we get a 5000000*87 feature matrix. If the 42 features are included in that matrix, what do the remaining (87-42) columns mean? Thanks.
maggie0830 commented
42 extracted features + 22 expected gains + 22 noise log-spectrum values + 1 VAD = 87, which is why we get a 5000000*87 matrix.
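To make the column layout concrete, here is a minimal sketch that reads such a matrix and splits each 87-value row into its four parts. It assumes the file is a flat dump of native-endian 32-bit floats, that the columns appear in the order listed above (features, gains, noise spectrum, VAD), and that the path passed to `read_rows` is whatever file the tool produced; the function names are made up for illustration and are not part of the repository.

```python
from array import array

NB_FEATURES = 42   # input features per frame
NB_BANDS = 22      # bands for the gain and noise-spectrum targets
ROW_WIDTH = NB_FEATURES + NB_BANDS + NB_BANDS + 1  # 42 + 22 + 22 + 1 = 87


def split_row(row):
    """Split one 87-value row into (features, gains, noise, vad).

    Assumes the column order features -> gains -> noise spectrum -> VAD.
    """
    if len(row) != ROW_WIDTH:
        raise ValueError(f"expected {ROW_WIDTH} values, got {len(row)}")
    features = row[:NB_FEATURES]                             # columns 0..41
    gains = row[NB_FEATURES:NB_FEATURES + NB_BANDS]          # columns 42..63
    noise = row[NB_FEATURES + NB_BANDS:ROW_WIDTH - 1]        # columns 64..85
    vad = row[ROW_WIDTH - 1]                                 # column 86
    return features, gains, noise, vad


def read_rows(path):
    """Yield rows from a flat float32 file (assumed native byte order)."""
    values = array("f")
    with open(path, "rb") as handle:
        values.frombytes(handle.read())
    if len(values) % ROW_WIDTH != 0:
        raise ValueError("file size is not a whole number of 87-float rows")
    for start in range(0, len(values), ROW_WIDTH):
        yield list(values[start:start + ROW_WIDTH])
```

With 5000000 rows the file would hold 5000000 * 87 floats, so iterating row by row as above avoids building one large nested list.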
mysteryjeans commented
There isn't much I can find on training. I want to train for 8 kHz narrowband and am interested in training on the MS-SNSD dataset combined with the various noise datasets shared on the demo.
Can you help with the following questions?
- Do I need to downsample all datasets to 8 kHz, or do I just need to downsample them to match the 16 kHz samples in MS-SNSD?
- Should the signal.raw mentioned in TRAINING-README contain noise, or should I use the clean audio files provided in the MS-SNSD clean_train folder as is?
- denoise_training takes only one signal.raw and one noise.raw, so how can I run it on multiple files, given that it overwrites training.f32 every time? Do I need to combine all the clean audio files into one file and all the noise files into another?
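If combining the files turns out to be the way to go, headerless PCM files can be joined by simple byte concatenation, since there is no header to rewrite. The sketch below assumes every input is already raw 16-bit mono PCM at the same sample rate (WAV files would first need their headers stripped and any resampling done beforehand); `concat_raw` is a hypothetical helper, not something shipped with the project.

```python
from pathlib import Path


def concat_raw(inputs, output, sample_width=2):
    """Append raw PCM files into one file and return the total sample count.

    Assumes all inputs are headerless, mono, and share one sample rate and
    sample width (2 bytes for 16-bit PCM). Nothing is resampled or converted.
    """
    total_bytes = 0
    with open(output, "wb") as out:
        for path in inputs:
            data = Path(path).read_bytes()
            if len(data) % sample_width != 0:
                raise ValueError(f"{path}: size is not a multiple of {sample_width} bytes")
            out.write(data)
            total_bytes += len(data)
    return total_bytes // sample_width
```

For example, `concat_raw(sorted(Path("clean_raw").glob("*.raw")), "signal.raw")` would build one speech file from a directory of already-converted clips; the directory name here is only a placeholder.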
RXAldreezee commented
Hi, what does the denoise_training file do? I can't seem to open it to see how it works. Thanks.