Silence detection (VAD)

Question

matthewkperez opened this issue 5 years ago · 1 comments

Hello, I want to train a VAD alongside the DNN senone classifier. My current idea is to create a custom loss function in which all pdf-ids that map to silence are read as 0s and all non-silence pdf-ids are read as 1s (binary classification for the VAD).

Is there a way to get the corresponding phone for each pdf-id?

Answer 1 · 2020-04-23T12:37:46.000Z

Hi,
Sorry for the late reply; we are very busy with our new project, SpeechBrain.

This is not something that can be extracted from what we implemented in PyTorch-Kaldi. You will probably need to call Kaldi directly to obtain the phoneme corresponding to each pdf-id. I suggest asking in the Kaldi Google group for the fastest solution, since the approach you choose might affect training time.
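
Once the pdf-id to phone mapping is available from Kaldi, turning senone labels into 0/1 VAD targets is simple. Below is a minimal sketch, not part of PyTorch-Kaldi. The mapping `PDF_TO_PHONES`, the phone names, and both helper functions are hypothetical and only illustrate the idea from the question. Because tree tying may let several phones share one pdf-id, the sketch treats a pdf-id as silence only when every phone mapped to it is a silence phone; that choice is an assumption you may want to change.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical mapping pdf-id -> phones that use it (obtain the real one from Kaldi).
PDF_TO_PHONES: Dict[int, Set[str]] = {
    0: {"SIL"},
    1: {"SIL"},
    2: {"AH"},
    3: {"AH", "AE"},  # a pdf-id shared by two phones
    4: {"NSN"},
}

# Hypothetical silence/noise phone names; use the ones from your own lexicon.
SILENCE_PHONES: Set[str] = {"SIL", "NSN"}


def silence_pdf_ids(pdf_to_phones: Dict[int, Set[str]],
                    silence_phones: Set[str]) -> Set[int]:
    """Return the pdf-ids whose phones are all silence phones."""
    return {
        pdf_id
        for pdf_id, phones in pdf_to_phones.items()
        if phones and phones <= silence_phones
    }


def vad_targets(pdf_ids: Iterable[int], sil_pdfs: Set[int]) -> List[int]:
    """Map frame-level pdf-id labels to 0 (silence) or 1 (non-silence)."""
    return [0 if pdf_id in sil_pdfs else 1 for pdf_id in pdf_ids]


sil_pdfs = silence_pdf_ids(PDF_TO_PHONES, SILENCE_PHONES)
frame_labels = [0, 2, 3, 4, 1]  # hypothetical frame-level pdf-id alignment
print(vad_targets(frame_labels, sil_pdfs))  # [0, 1, 1, 0, 0]
```

The resulting 0/1 sequence could then serve as the target of a binary cross-entropy loss on a second output of the network, which is one possible way to realise the custom loss described in the question.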