xiph/rnnoise

How to train with large dataset

Bach1502 opened this issue · 5 comments

Hello,
I believe this is a fairly simple question, but since I'm very new to ML in general, it still baffles me. I followed the training instructions and have successfully trained a model on one pair of files (a clean speech.wav and a noise.wav). Now I'd like to ask how to repeat this process for a larger dataset. I currently have a set of 300 files for each of these two categories, and I don't think repeating the process 300 times is the way to go.

Thanks.

Just concatenate the audio files.
But be aware that the input format is not .wav; it's plain PCM without any header.
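A minimal sketch of that conversion, using only Python's standard library (the file name `tone.wav` and the helper `wav_to_raw_pcm` are made up for illustration): reading a .wav with the `wave` module and keeping only the sample bytes leaves you with the headerless 16-bit PCM described above.

```python
import struct
import wave

def wav_to_raw_pcm(path):
    # Read a .wav and return only its sample bytes (headerless 16-bit PCM).
    with wave.open(path, "rb") as w:
        assert w.getsampwidth() == 2  # expect 16-bit samples
        return w.readframes(w.getnframes())

# Demo: synthesize a tiny mono 48 kHz .wav, then strip its header.
with wave.open("tone.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)       # 16-bit
    w.setframerate(48000)
    w.writeframes(struct.pack("<4h", 0, 1000, 0, -1000))

raw = wav_to_raw_pcm("tone.wav")
print(len(raw))  # 8 bytes: 4 samples x 2 bytes each
```

Once every file is in this raw form, concatenating them is just appending bytes.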

Thank you, I'll try it and see if it works.

I'd like to know how you concatenated the audio files. Did you use any tools, or did you just copy the raw files and paste them into one? How can I get one long raw file? I would be very grateful if you could help me.

I wrote a Python script to concatenate the files. For reading the audio files I used the soundfile package, and I resampled where needed using scipy.
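A sketch of the concatenation step, simplified to use only the standard library (the soundfile/scipy version would additionally read each .wav with `soundfile.read` and resample with `scipy.signal.resample_poly` before writing; the file names here are made up for the demo):

```python
import pathlib

def concat_raw(inputs, output):
    # Append each input file's raw PCM bytes onto one long output file.
    with open(output, "wb") as out:
        for path in inputs:
            out.write(pathlib.Path(path).read_bytes())

# Demo with two tiny stand-in "raw" files (three 16-bit samples total).
pathlib.Path("a.raw").write_bytes(b"\x01\x00\x02\x00")
pathlib.Path("b.raw").write_bytes(b"\x03\x00")
concat_raw(["a.raw", "b.raw"], "all.raw")
print(pathlib.Path("all.raw").read_bytes())
```

Note that simple byte concatenation is only valid once all inputs share the same sample rate, bit depth, and channel count, which is why the resampling pass comes first.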

Sorry, but I think your behavior in the GitHub issues is somewhat inappropriate.
You spammed the very same question three times across multiple issues:
#208
#201 (comment)
#196
You can answer your question yourself by reading the RNNoise paper and newer speech enhancement papers.
They all report how much data they are using.