speech-recognition

There are 5575 repositories under the speech-recognition topic.

  • whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    Language: C++, 43.2k stars
  • DeepSpeech

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.

    Language: C++, 26.6k stars
  • faster-whisper

    Faster Whisper transcription with CTranslate2

    Language: Python, 18.1k stars
  • whisperX

    WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

    Language: Python, 17.8k stars
  • leon

    🧠 Leon is your open-source personal assistant.

    Language: TypeScript, 16.6k stars
  • kaldi

    kaldi-asr/kaldi is the official location of the Kaldi project.

    Language: Shell, 15.1k stars
  • DeepLearningExamples

    State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

    Language: Jupyter Notebook, 14.5k stars
  • vosk-api

    Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

    Language: Jupyter Notebook, 13.2k stars
  • deep-learning-drizzle

    Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!

    Language: HTML, 12.7k stars
  • FunASR

    A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

    Language: Python, 12.6k stars
  • PaddleSpeech

    Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

    Language: Python, 12.2k stars
  • speechbrain

    A PyTorch-based Speech Toolkit

    Language: Python, 10.4k stars
  • espnet

    End-to-End Speech Processing Toolkit

    Language: Python, 9.5k stars
  • speech_recognition

    Speech recognition module for Python, supporting several engines and APIs, online and offline.

    Language: Python, 8.9k stars
  • openvino

    OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

    Language: C++, 8.8k stars
  • ASRT_SpeechRecognition

    A Deep-Learning-Based Chinese Speech Recognition System

    Language: Python, 8.2k stars
  • annyang

    💬 Speech recognition for your site

    Language: JavaScript, 6.7k stars
  • SenseVoice

    Multilingual Voice Understanding Model

    Language: Python, 6.6k stars
  • wav2letter

    Facebook AI Research's Automatic Speech Recognition Toolkit

    Language: C++, 6.4k stars
  • PaddleX

    All-in-One Development Tool based on PaddlePaddle

    Language: Python, 5.8k stars
  • silero-models

    Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

    Language: Jupyter Notebook, 5.5k stars
  • WhisperKit

    On-device Speech Recognition for Apple Silicon

    Language: Swift, 5k stars
  • FunClip

    Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.

    Language: Python, 5k stars
  • whisper-diarization

    Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

    Language: Jupyter Notebook, 5k stars
  • voice-pro

    Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

    Language: Python, 4.8k stars
  • wenet

    Production First and Production Ready End-to-End Speech Recognition Toolkit

    Language: Python, 4.8k stars
  • whisper-jax

    JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

    Language: Jupyter Notebook, 4.6k stars
  • ml-road

    Machine Learning Resources, Practice and Research

    Language: Python, 4.5k stars
  • porcupine

    On-device wake word detection powered by deep learning

    Language: Python, 4.4k stars
  • pocketsphinx

    A small speech recognizer

    Language: C, 4.2k stars
  • distil-whisper

    Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

    Language: Python, 3.9k stars
  • stt

    Voice Recognition to Text Tool: an offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

    Language: Python, 3.8k stars
  • awesome-speech-recognition-speech-synthesis-papers

    Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

  • whisper-asr-webservice

    OpenAI Whisper ASR Webservice API

    Language: Python, 2.9k stars
  • willow

    Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative

    Language: C, 2.9k stars
  • lingvo

    Lingvo

    Language: Python, 2.9k stars