BUTSpeechFIT/VBx

Create VAD

Closed this issue · 5 comments

I don't know how to create a VAD. I tried using auditok to create one, but with my VAD the DER result is quite bad.

Hello,
We have (in a different branch) code for running an energy-based VAD. From the website of auditok, it seems their VAD is based on a similar approach. As they point out in their limitations, this type of VAD works reasonably well on recordings with low background noise. If you are trying to apply it on more challenging data, it makes sense for the quality to be worse. From your message, I do not know what kind of data you are using.
If you want to use an energy-based VAD, I recommend exploring different detection thresholds. Tuning the threshold to your application may give better performance. Still, if your data are challenging, you will need to use a different type of VAD.
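To make the threshold idea concrete, here is a minimal sketch of an energy-based VAD. It is not the code from the VBx branch and not auditok's implementation; the function name `energy_vad`, the frame sizes, and the default `threshold_db` are all assumptions chosen for illustration. A frame is marked as speech when its log-energy is within `threshold_db` of the loudest frame in the recording.

```python
import numpy as np


def energy_vad(signal, sample_rate, frame_ms=25.0, hop_ms=10.0, threshold_db=-35.0):
    """Return a list of (start_s, end_s) speech segments.

    Hypothetical helper: a frame counts as speech when its log-energy is
    above (max frame log-energy + threshold_db). threshold_db is the value to tune.
    """
    signal = np.asarray(signal, dtype=np.float64)
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    if len(signal) < frame_len:
        return []

    # Mean energy of each (overlapping) frame, converted to dB.
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    energies = np.array([
        np.mean(signal[i * hop_len:i * hop_len + frame_len] ** 2)
        for i in range(n_frames)
    ])
    log_energy = 10.0 * np.log10(energies + 1e-12)

    # Threshold relative to the loudest frame.
    is_speech = log_energy > (log_energy.max() + threshold_db)

    # Merge consecutive speech frames into segments (in seconds).
    segments = []
    start = None
    for i, flag in enumerate(is_speech):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            end_sample = (i - 1) * hop_len + frame_len
            segments.append((start * hop_len / sample_rate, end_sample / sample_rate))
            start = None
    if start is not None:
        end_sample = (n_frames - 1) * hop_len + frame_len
        segments.append((start * hop_len / sample_rate, end_sample / sample_rate))
    return segments
```

Raising `threshold_db` toward 0 makes the detector stricter, and lowering it makes it more permissive. Because the threshold is relative to the loudest frame, this sketch assumes the recording contains at least some speech, and it will still struggle with loud background noise.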

Hello @fnlandini ,
I have the same issue. I am running the AMI recipe and want to reproduce the DER of 18.99%.
I created oracle vad from the rttm they provided, and credited to ES2005a.lab.
But I got a DER of 18.63. May I ask how you obtained the DER of 18.99%?

@DTDwind, this issue refers to computing VAD from the signal so I am not sure how this can be the same issue.

We obtained 18.99 on the test set using the parameters that we shared in the corresponding run script and using oracle VAD.
I also do not understand why you mention ES2005a.lab
We have shared the oracle VAD files in the protocol repository. If you used those files, you should have obtained a very similar result. Sometimes, there can be small differences due to different packages (numpy or others). What you obtained is a bit too different but maybe plausible (assuming you did not change any parameters).
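For readers who want to derive an oracle VAD from a reference RTTM themselves, the following is a rough sketch. It assumes the standard RTTM `SPEAKER` line layout (recording ID in field 2, onset in field 4, duration in field 5) and assumes a `.lab` file with one `start end sp` line per speech segment, in seconds. The function names are hypothetical, and the exact `.lab` layout should be checked against the files in the protocol repository.

```python
from collections import defaultdict


def rttm_to_speech_segments(rttm_lines):
    """Merge per-speaker RTTM segments into speech segments per recording.

    Overlapping or touching speaker turns are joined, because a VAD only
    says whether anyone is speaking.
    """
    per_recording = defaultdict(list)
    for line in rttm_lines:
        fields = line.split()
        if len(fields) < 5 or fields[0] != "SPEAKER":
            continue  # skip blank or non-speaker lines
        start = float(fields[3])
        duration = float(fields[4])
        per_recording[fields[1]].append((start, start + duration))

    merged = {}
    for recording, segments in per_recording.items():
        segments.sort()
        out = [list(segments[0])]
        for start, end in segments[1:]:
            if start <= out[-1][1]:
                out[-1][1] = max(out[-1][1], end)  # extend the current segment
            else:
                out.append([start, end])
        merged[recording] = [tuple(seg) for seg in out]
    return merged


def format_lab(segments):
    """Format segments as 'start<TAB>end<TAB>sp' lines (assumed .lab layout)."""
    return "".join(f"{start:.3f}\t{end:.3f}\tsp\n" for start, end in segments)
```

Each recording's output from `format_lab` would then be written to its own file, for example `ES2005a.lab`. A result produced this way can differ slightly from the shared oracle VAD files if the segment boundaries are rounded differently.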

@fnlandini Thanks!!! You helped me very much! And I am sorry for my poor English >"<

@fnlandini Thank you!!! I didn't know about energy-based VAD. I will try it.