stt

There are 404 repositories under the stt topic.

  • khoj-ai/khoj

    Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (e.g., gpt, claude, gemini, llama, qwen, mistral).

    Language: Python · 15.8k stars
  • alphacep/vosk-api

    Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

    Language: Jupyter Notebook · 8.1k stars
  • snakers4/silero-models

    Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

    Language: Jupyter Notebook · 5k stars
  • jianchang512/stt

    Voice Recognition to Text Tool: an offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

    Language: Python · 2.5k stars
  • coqui-ai/STT

    🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

    Language: C++ · 2.3k stars
  • pannous/tensorflow-speech-recognition

    🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks

    Language: Python · 2.2k stars
  • pluja/whishper

    Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!

    Language: Svelte · 1.6k stars
  • coqui-ai/open-speech-corpora

    💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

  • abus-aikorea/voice-pro

    Gradio WebUI for whisper, faster-whisper, whisper-timestamped. Supports YouTube Downloader, Vocal Remover, Transcription, Text-to-Speech (Edge-TTS, F5-TTS), and Translation.

    Language: Python · 914 stars
  • Robitx/gp.nvim

    Gp.nvim (GPT prompt) Neovim AI plugin: ChatGPT sessions & Instructable text/code operations & Speech to text [OpenAI, Ollama, Anthropic, ..]

    Language: Lua · 887 stars
  • R3gm/SoniTranslate

    Synchronized Translation for Videos. Video dubbing

    Language: Python · 876 stars
  • lenML/Speech-AI-Forge

    🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

    Language: Python · 857 stars
  • snakers4/open_stt

    Open STT

    Language: Python · 783 stars
  • evancohen/sonus

    💬 /so.nus/ STT (speech to text) for Node with offline hotword detection

    Language: JavaScript · 627 stars
  • VRCWizard/TTS-Voice-Wizard

    Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)

    Language: C# · 603 stars
  • Picovoice/cheetah

    On-device streaming speech-to-text engine powered by deep learning

    Language: Python · 594 stars
  • mkiol/dsnote

    Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

    Language: C++ · 581 stars
  • bbc/react-transcript-editor

    A React component to make correcting automated transcriptions of audio and video easier and faster. By BBC News Labs. - Work in progress

    Language: JavaScript · 571 stars
  • lobehub/lobe-tts

    🎤 Lobe TTS - A high-quality & reliable TTS/STT library for Server and Browser

    Language: TypeScript · 464 stars
  • voice-engine/make-a-smart-speaker

    A collection of resources to make a smart speaker

  • Macoron/whisper.unity

    Running speech to text model (whisper.cpp) in Unity3d on your local machine.

    Language: C# · 434 stars
  • Picovoice/leopard

    On-device speech-to-text engine powered by deep learning

    Language: Python · 433 stars
  • OpenNewsLabs/autoEdit_2

    Fast text-based video editing: a Node Electron OS X desktop app with a Backbone front end.

    Language: JavaScript · 421 stars
  • StarmoonAI/Starmoon

    An open source voice-enabled, compact, empathic AI hardware + software 🤖 framework for companionship, entertainment, education, pediatric care, IoT robotics applications, AI-enhanced robotics application services, research, and DIY robotics kit development using Python, NextJs, Arduino, ESP32, LLMs (GPT), STT, TTS, Emotion Analysis, AI agent

    Language: TypeScript · 413 stars
  • gia-guar/JARVIS-ChatGPT

    A Conversational Assistant equipped with synthetic voices including J.A.R.V.I.S's. Powered by OpenAI and IBM Watson APIs and a Tacotron model for voice generation.

    Language: Python · 394 stars
  • ccoreilly/vosk-browser

    A speech recognition library running in the browser thanks to a WebAssembly build of Vosk

    Language: JavaScript · 382 stars
  • deepgram-devs/deepgram-ai-agent-demo

    Deepgram Conversational AI demo

    Language: TypeScript · 346 stars
  • NsLearning/LangHelper

    A full-featured language-learning application built on ChatGPT, TTS, STT, and other AI models. Supports conversation, speaking assessment, memorizing words in context, listening tests, and more.

    Language: Rust · 327 stars
  • Open-Speech-EkStep/vakyansh-models

    Open source speech to text models for Indic Languages

  • algolia/voice-overlay-android

    🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI

    Language: Kotlin · 255 stars
  • Ikaros-521/RealtimeSTT_LLM_TTS

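    Real-time STT connected to the OpenAI API / Zhipu AI (streaming LLM) and GPT-SOVITS / Edge-TTS, with cross-network service calls made through a web page to achieve real-time conversation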

    Language: Python · 253 stars
  • livekit-examples/kitt

    Talk to ChatGPT in real time using LiveKit

    Language: Go · 240 stars
  • nikdanilov/whisper-obsidian-plugin

    Speech-to-text in Obsidian using OpenAI Whisper

    Language: TypeScript · 227 stars
  • sovaai/sova-asr

    SOVA ASR (Automatic Speech Recognition)

    Language: Python · 169 stars
  • gaborvecsei/whisper-live-transcription

    Live-Transcription (STT) with Whisper PoC

    Language: Python · 155 stars
  • MycroftAI/ZZZ-RETIRED__openstt

    RETIRED - OpenSTT is now retired. If you would like more information on Mycroft AI's open source STT projects, please visit: