How to process 1h30m of audio?
I'm using the run_example.sh script to get the rttm. It works for most of the files; however, the oom_killer kills the process for the following conversation in wav format (ffprobe output):
Duration: 01:21:25.15, bitrate: 256 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, 1 channels, s16, 256 kb/s
I created a dummy .lab file with a single line: 0.000 4885.153375 sp
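For reference, a single-line .lab file like this can be generated from the wav header rather than typed by hand. The following is only a sketch: the function name write_dummy_lab is hypothetical, and the "start end label" layout is assumed from the line quoted above, not taken from the tool's documentation.

```python
import wave


def write_dummy_lab(wav_path, lab_path, label="sp"):
    """Write a one-line .lab file marking the whole recording as speech.

    Assumes the .lab layout is "<start_s> <end_s> <label>", as in the
    dummy line quoted in the question.
    """
    with wave.open(wav_path, "rb") as wav:
        # Duration in seconds = number of sample frames / sampling rate.
        duration = wav.getnframes() / wav.getframerate()
    line = f"0.000 {duration:.6f} {label}"
    with open(lab_path, "w") as lab:
        lab.write(line + "\n")
    return line
```

For the file above, 4885.153375 s at 16000 Hz corresponds to 78162454 samples.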
How can I process this long file?
Hi, unless you know that the whole file is speech, you will probably be better off using a simple VAD to produce a more reasonable .lab file.
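As an illustration of what "a simple VAD" could look like, here is a minimal energy-threshold sketch using only the Python standard library. It is not the VAD used by this project; the function names, the 25 ms frame size, the -35 dB threshold and the 0.25 s minimum segment length are all assumptions to tune per recording, and the "start end sp" output layout is assumed from the dummy line in the question. Pure-Python loops are slow on an 80-minute file, so a NumPy version would be preferable in practice.

```python
import array
import math
import sys
import wave


def read_mono_pcm16(wav_path):
    """Read a mono 16-bit PCM wav file; returns (samples, sample_rate)."""
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM")
        rate = wav.getframerate()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # wav data is little-endian
    return samples, rate


def energy_vad(samples, rate, frame_s=0.025, threshold_db=-35.0,
               min_speech_s=0.25):
    """Return (start_s, end_s) segments whose frame RMS exceeds a threshold.

    All default values are illustrative assumptions, not tuned settings.
    """
    frame_len = int(round(rate * frame_s))
    n_frames = len(samples) // frame_len
    segments = []
    start = None
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        # Level relative to 16-bit full scale, in dB.
        level_db = 20 * math.log10(max(rms, 1e-9) / 32768.0)
        t = i * frame_len / rate
        if level_db > threshold_db:
            if start is None:
                start = t
        elif start is not None:
            if t - start >= min_speech_s:
                segments.append((start, t))
            start = None
    if start is not None:
        end = n_frames * frame_len / rate
        if end - start >= min_speech_s:
            segments.append((start, end))
    return segments


def write_lab(segments, lab_path, label="sp"):
    """Write one "<start> <end> <label>" line per speech segment."""
    with open(lab_path, "w") as lab:
        for start, end in segments:
            lab.write(f"{start:.3f} {end:.3f} {label}\n")
```

Typical use would be `samples, rate = read_mono_pcm16("conversation.wav")` followed by `write_lab(energy_vad(samples, rate), "conversation.lab")`, where both file names are placeholders.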
Besides that, what takes most of the time when processing very long files is the AHC step. You can try using random initializations. See the option random_5 in vbhmm.py. That will run much faster and probably your job will not get killed.
Thanks for your help. Using the new .lab file, both AHC+VB and random_5 now finish successfully. AHC+VB took ~1 hour, while random_5 took ~5 min.