Comment on VAD
donpark opened this issue · 0 comments
donpark commented
I haven't had time to read the original paper, but based on some live tests, the implementation appears to be much better at detecting the end of voice activity than its start. In particular, it seems more sensitive to certain starting words like "OK Google" and less sensitive to words like "It".
My current opinion is that, if the implementation correctly reflects the algorithm, it's best used only to detect the end of speech, with some other means such as a touch or press used to start capturing voice. That isn't too bad for a voice command application, but it's less than ideal for VoIP.
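To make the suggestion concrete, here is a rough sketch of what I mean by "external start, VAD-detected end". It is not based on this project's code: `energy_vad` is a hypothetical stand-in for the real detector (a plain energy threshold), and the function names, the threshold, and the `hangover_frames` parameter are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Sequence

Frame = Sequence[float]


def frame_energy(frame: Frame) -> float:
    """Mean squared amplitude of one frame (0.0 for an empty frame)."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def energy_vad(frame: Frame, threshold: float = 0.01) -> bool:
    """Hypothetical stand-in for the real VAD: True if the frame looks voiced."""
    return frame_energy(frame) > threshold


def capture_until_silence(
    frames: Iterable[Frame],
    is_voiced: Callable[[Frame], bool] = energy_vad,
    hangover_frames: int = 3,
) -> List[Frame]:
    """Capture frames starting at an external trigger and stop at end of speech.

    Capture is assumed to begin with the first frame, i.e. the caller starts
    feeding frames when the user touches/presses. The VAD is used only to
    decide when to stop: after speech has been heard at least once, capture
    ends once `hangover_frames` consecutive frames are unvoiced.
    """
    captured: List[Frame] = []
    heard_speech = False
    silent_run = 0
    for frame in frames:
        captured.append(frame)
        if is_voiced(frame):
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            # Only count silence after speech, so a pause between the
            # press and the first word does not end the capture.
            silent_run += 1
            if silent_run >= hangover_frames:
                break
    return captured


if __name__ == "__main__":
    silent = [0.0] * 4
    loud = [0.5] * 4
    stream = [silent] * 2 + [loud] * 3 + [silent] * 5
    print(len(capture_until_silence(stream)))
```

Because every frame after the press is kept, a weakly detected first word like "It" still ends up in the capture; the detector only has to notice speech at some point and then notice that it has stopped. A real application would presumably also want a timeout for the case where no speech is ever detected.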
Thoughts?