speech-recognition

There are 5136 repositories under the speech-recognition topic.

huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Language: Python | 142k 1.1k 16.9k 28.4k
ggerganov/whisper.cpp
Port of OpenAI's Whisper model in C/C++
Language: C++ | 38.8k 327 1.6k 4.1k
mozilla/DeepSpeech
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Language: C++ | 26.1k 676 2.1k 4k
leon-ai/leon
🧠 Leon is your open-source personal assistant.
Language: TypeScript | 16.1k 271 3031.3k
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
Language: Python | 15.1k 135 8161.3k
kaldi-asr/kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
Language: Shell | 14.7k 696 1.7k 5.3k
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Language: Python | 14.7k 145 7921.6k
NVIDIA/DeepLearningExamples
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Language: Jupyter Notebook | 14.1k 296 8583.3k
kmario23/deep-learning-drizzle
Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!
Language: HTML | 12.5k 604 443k
PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Language: Python | 11.7k 187 2k 1.9k
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
Language: Python | 9.6k 134 1.1k 1.5k
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Language: Python | 9.2k 75 1.4k 935
alphacep/vosk-api
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Language: Jupyter Notebook | 9.1k 120 1.6k 1.2k
espnet/espnet
End-to-End Speech Processing Toolkit
Language: Python | 8.9k 174 2.4k 2.2k
Uberi/speech_recognition
Speech recognition module for Python, supporting several engines and APIs, online and offline.
Language: Python | 8.7k 275 6232.4k
nl8590687/ASRT_SpeechRecognition
A Deep-Learning-Based Chinese Speech Recognition System
Language: Python | 8.1k 181 2931.9k
openvinotoolkit/openvino
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
Language: C++ | 8k 193 2.9k 2.5k
TalAter/annyang
💬 Speech recognition for your site
Language: JavaScript | 6.7k 238 3431k
flashlight/wav2letter
Facebook AI Research's Automatic Speech Recognition Toolkit
Language: C++ | 6.4k 245 9251k
snakers4/silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Language: Jupyter Notebook | 5.2k 85 131331
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
Language: Python | 5.1k 51 186466
sanchit-gandhi/whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
Language: Jupyter Notebook | 4.6k 45 183397
argmaxinc/WhisperKit
On-device Speech Recognition for Apple Silicon
Language: Swift | 4.4k 43 181375
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
Language: Python | 4.4k 91 1.1k 1.1k
modelscope/FunClip
Open-source, accurate and easy-to-use video speech recognition & clipping tool, with integrated LLM-based AI clipping.
Language: Python | 4.3k 41 108493
MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Language: Jupyter Notebook | 4.3k 46 242394
cmusphinx/pocketsphinx
A small speech recognizer
Language: C | 4.1k 159 275725
Picovoice/porcupine
On-device wake word detection powered by deep learning
Language: Python | 4k 65 567514
yanshengjia/ml-road
Machine Learning Resources, Practice and Research
Language: Python | 4k 160 11.5k
huggingface/distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
Language: Python | 3.8k 68 108316
abus-aikorea/voice-pro
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
Language: Python | 3.5k 24 38265
jianchang512/stt
Voice Recognition to Text Tool: a local, offline tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats
Language: Python | 3.2k 15 111349
zzw922cn/awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
3k 186 7514
zzw922cn/Automatic_Speech_Recognition
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
Language: Python | 2.8k 145 90534
tensorflow/lingvo
Lingvo
Language: Python | 2.8k 117 254448
toverainc/willow
Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative
Language: C | 2.7k 42 160104
