voice-activity-detection
There are 184 repositories under the voice-activity-detection topic.
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
noisetorch/NoiseTorch
Real-time microphone noise suppression on Linux.
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
smacke/ffsubsync
Automagically synchronize subtitles with video.
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
jim-schwoebel/voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
BingLingGroup/autosub
Command-line utility to transcribe/translate from video/audio/subtitles to subtitles
ricky0123/vad
Voice activity detector (VAD) for the browser with a simple API
k2-fsa/sherpa-ncnn
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A, etc.
juanmc2005/diart
A python package to build AI-powered real-time audio applications
TEN-framework/ten-vad
Voice Activity Detection (VAD): low-latency, high-performance, and lightweight
coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
ggeop/Python-ai-assistant
Python AI assistant 🧠
jtkim-kaist/VAD
Voice activity detection (VAD) toolkit including DNN-, bDNN-, LSTM-, and ACAM-based VAD. We also provide our directly recorded dataset.
ina-foss/inaSpeechSegmenter
CNN-based audio segmentation toolkit. Detects speech, music, noise, and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
amsehili/auditok
An audio/acoustic activity detection and audio segmentation tool
iamsrikanthnani/pluely
The Open Source Alternative to Cluely - A lightning-fast, privacy-first AI assistant that works seamlessly during meetings, interviews, and conversations without anyone knowing. Built with Tauri for native performance, just 10MB. Completely undetectable in video calls, screen shares, and recordings.
FluidInference/FluidAudio
Native Swift and CoreML SDK for local speaker diarization, VAD, and speech-to-text for real-time workloads. Works on iOS and macOS.
baxtree/subaligner
Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/
shashikg/WhisperS2T
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
gkonovalov/android-vad
Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
gtreshchev/RuntimeAudioImporter
Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime.
jim-schwoebel/voicebook
🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).
filippogiruzzi/voice_activity_detection
Voice Activity Detection based on Deep Learning & TensorFlow
Picovoice/cobra
On-device voice activity detection (VAD) powered by deep learning
tomchang25/whisper-auto-transcribe
Auto-transcription tool based on Whisper
nicklashansen/voice-activity-detection
Voice Activity Detection (VAD) using deep learning.
eesungkim/Voice_Activity_Detector
A statistical model-based voice activity detector
pmbstyle/Alice
Alice is a smart desktop AI assistant application built with Vue.js, Vite, and Electron. Advanced memory system, function calling, MCP support, optional fully local use, and more.
voithru/voice-activity-detection
PyTorch implementation of SELF-ATTENTIVE VAD, ICASSP 2021
zhenghuatan/rVADfast
This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method.
RicherMans/GPV
Repository for our Interspeech 2020 general-purpose voice activity detection (GPVAD) paper
zhenghuatan/rVAD
Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method.
Speech-Interaction-Technology-Aalto-U/itsp
Introduction to Speech Processing
Ankit-Kumar-Saini/Coursera_Deep_Learning_Specialization
Implementation of Logistic Regression, MLP, CNN, RNN & LSTM from scratch in Python. Training of deep learning models for image classification, object detection, and sequence processing (including a transformer implementation) in TensorFlow.
RicherMans/Datadriven-GPVAD
The codebase for Data-driven general-purpose voice activity detection.