How to process 1h30m of audio?

Question

Closed this issue 4 years ago · 2 comments

I'm using the run_example.sh script to produce the RTTM output. It works for most files, but the OOM killer terminates the process for the following conversation in WAV format (ffprobe output):

Duration: 01:21:25.15, bitrate: 256 kb/s
    Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, 1 channels, s16, 256 kb/s

I created a dummy .lab file containing a single line that marks the whole recording as speech: 0.000 4885.153375 sp

How can I process this long file?

Answer 1 · 2021-05-19T15:48:18.000Z

Hi, unless you know that the whole file contains speech, you will probably be better off running a simple VAD to produce a more reasonable .lab file.
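To make the VAD suggestion concrete, here is a minimal sketch of an energy-based VAD that writes lines in the same "start end sp" format as the dummy .lab file. It is not the VAD used by this repository: the function names, the 25 ms frame, the -40 dB threshold and the 0.3 s merge gap are all assumptions chosen for illustration, and a real recording will likely need tuning or a proper VAD. The WAV is read frame by frame so the whole file is never held in memory.

```python
import array
import math
import sys
import wave


def wav_frame_levels(path, frame_s=0.025):
    """Yield the RMS level of each frame in dB relative to 16-bit full scale.

    Reads the file one frame at a time, so memory use stays constant.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM")
        frame_len = int(wav.getframerate() * frame_s)
        while True:
            raw = wav.readframes(frame_len)
            if len(raw) < 2 * frame_len:
                break  # drop the final partial frame
            frame = array.array("h", raw)
            if sys.byteorder == "big":
                frame.byteswap()  # WAV samples are little-endian
            rms = math.sqrt(sum(s * s for s in frame) / frame_len)
            yield 20.0 * math.log10(max(rms, 1e-9) / 32768.0)


def levels_to_segments(levels, frame_s=0.025, threshold_db=-40.0, min_gap_s=0.3):
    """Turn per-frame levels into (start, end) speech segments in seconds.

    Frames at or above threshold_db count as speech (assumed threshold);
    segments separated by less than min_gap_s are merged.
    """
    segments = []
    for idx, level in enumerate(levels):
        if level < threshold_db:
            continue
        start, end = idx * frame_s, (idx + 1) * frame_s
        if segments and start - segments[-1][1] < min_gap_s:
            segments[-1][1] = end
        else:
            segments.append([start, end])
    return [tuple(seg) for seg in segments]


def format_lab(segments, label="sp"):
    """Format segments as .lab lines: '<start> <end> <label>'."""
    return "".join("%.3f %.3f %s\n" % (s, e, label) for s, e in segments)
```

A possible use, with hypothetical file names: `open("conversation.lab", "w").write(format_lab(levels_to_segments(wav_frame_levels("conversation.wav"))))`.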
Besides that, the AHC step is what takes most of the time when processing very long files. You can try using random initialization instead: see the option random_5 in vbhmm.py. That will run much faster, and your job will probably not get killed.
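As a rough illustration of why AHC becomes expensive on long files, the sketch below estimates the size of one dense pairwise similarity matrix, which agglomerative clustering generally needs. The numbers are assumptions, not measurements from this toolkit: one embedding every 0.25 s and 8-byte floats are guesses, and the helper name is hypothetical.

```python
def ahc_matrix_estimate(speech_s, hop_s=0.25, bytes_per_value=8):
    """Return (number of embeddings, size in GiB of one N x N matrix).

    hop_s (one embedding every 0.25 s) and bytes_per_value (float64)
    are illustrative assumptions, not values taken from the toolkit.
    """
    n = int(speech_s / hop_s)
    return n, n * n * bytes_per_value / 2**30


# Whole file marked as speech, as in the dummy .lab file.
n_full, gib_full = ahc_matrix_estimate(4885.153375)

# Hypothetical case: a VAD keeps only half of the file.
n_half, gib_half = ahc_matrix_estimate(4885.153375 / 2)

print(n_full, round(gib_full, 2))
print(n_half, round(gib_half, 2))
```

Under these assumptions the full file gives roughly 19,500 embeddings and a matrix of a little under 3 GiB, and halving the amount of speech cuts the matrix to about a quarter of that, since the cost grows with the square of the number of embeddings.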

Answer 2 · 2021-05-19T19:22:43.000Z

Thanks for your help. With the new .lab file, both AHC+VB and random_5 now finish successfully. AHC+VB took ~1 hour, while random_5 took ~5 min.