gkonovalov/android-vad

Support Silero VAD v5?

First of all, thank you for your excellent work on this Android VAD library. It has been incredibly useful.

I noticed that the Silero VAD model has been updated to version 5.0 (https://github.com/snakers4/silero-vad/releases/tag/v5.0), which includes several significant improvements and changes:

  • Slightly faster inference
  • Model size increased to 2MB (from 1MB)
  • Support for over 6,000 languages
  • Significantly more robust on noisy data
  • 5-7% quality increase on clean data
  • Quality difference for different window sizes is now negligible
  • Deprecated window_size_samples - now uses fixed size window
  • Works with 8 kHz and 16 kHz sample rates, with fixed 256 and 512 sample windows respectively
  • Changed internal logic to pass context from previous chunk
  • Still supports sample rates that are multiples of 16 kHz
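
To make the fixed-window change concrete, here is a minimal Java sketch of how audio might be framed for v5: 256 samples at 8 kHz and 512 samples at 16 kHz, as listed above. The class and method names (`FixedWindows`, `windowSize`, `split`) are hypothetical and not part of android-vad or silero-vad. The sketch also assumes that a trailing partial window is simply dropped, and it leaves out the multiples-of-16-kHz case, which would need downsampling first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not an android-vad API.
final class FixedWindows {
    private FixedWindows() {}

    // Window sizes from the v5.0 notes: 256 samples at 8 kHz, 512 at 16 kHz.
    static int windowSize(int sampleRate) {
        switch (sampleRate) {
            case 8000:
                return 256;
            case 16000:
                return 512;
            default:
                throw new IllegalArgumentException("Unsupported sample rate: " + sampleRate);
        }
    }

    // Splits PCM samples into full fixed-size windows.
    // Assumption: samples that do not fill a whole window are dropped.
    static List<short[]> split(short[] samples, int sampleRate) {
        int size = windowSize(sampleRate);
        List<short[]> windows = new ArrayList<>();
        for (int offset = 0; offset + size <= samples.length; offset += size) {
            short[] window = new short[size];
            System.arraycopy(samples, offset, window, 0, size);
            windows.add(window);
        }
        return windows;
    }
}
```

With this shape, the caller would no longer choose a window size; it would follow from the sample rate alone.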

Question

Given these significant improvements and changes, would it be possible to update this library to support the new v5.0 model? The enhancements in performance, language support, and robustness could greatly benefit users of the library.

Offer to Contribute

If updating to v5.0 is not currently on your roadmap, I would be happy to submit a PR that adds this support. Before starting, I would appreciate some guidance on:

  1. The best approach to integrate the new model, considering the changes in window sizes and internal logic
  2. How to handle the deprecation of window_size_samples in the library's API
  3. Any specific areas of the codebase that would need modification to accommodate these changes
  4. Any considerations or challenges you foresee in this update, particularly regarding the increased model size
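
As a starting point for item 1, the "pass context from previous chunk" change could be sketched as below. This is only my reading of the release note, not the model's confirmed behavior: the tail of the previous model input is prepended to the next window. The class name `ContextCarry` and its methods are hypothetical, and the context length is left as a constructor parameter because I have not verified the value the v5 model expects.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of android-vad or silero-vad.
final class ContextCarry {
    // Tail of the previous model input; all zeros before the first window.
    private final float[] context;

    // contextSize is an assumption to be checked against the v5 model.
    ContextCarry(int contextSize) {
        this.context = new float[contextSize];
    }

    // Returns context + window, then stores the tail of that input for the next call.
    float[] prepend(float[] window) {
        float[] input = new float[context.length + window.length];
        System.arraycopy(context, 0, input, 0, context.length);
        System.arraycopy(window, 0, input, context.length, window.length);
        System.arraycopy(input, input.length - context.length, context, 0, context.length);
        return input;
    }

    // Clears the carried context, for example when a new audio stream starts.
    void reset() {
        Arrays.fill(context, 0f);
    }
}
```

If this matches how the model works, the library would need to keep this buffer alongside the model state and clear both together.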