tts

There are 2,540 repositories under the tts topic.

  • CorentinJ/Real-Time-Voice-Cloning

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

    Language:Python52.8k9421.1k8.8k
  • lobehub/lobe-chat

    🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), Knowledge Base (file upload / knowledge management / RAG), Multi-Modals (Vision/TTS) and a plugin system. One-click FREE deployment of your private ChatGPT/Claude application.

    Language:TypeScript44.8k2082.3k10.1k
  • RVC-Boss/GPT-SoVITS

    One minute of voice data is enough to train a good TTS model (few-shot voice cloning).

    Language:Python35.9k2131.3k4.1k
  • coqui-ai/TTS

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

    Language:Python35.6k2951.1k4.4k
  • babysor/MockingBird

    🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time

    Language:Python35.3k3088785.2k
  • 2noise/ChatTTS

    A generative speech model for daily dialogue.

    Language:Python32.5k1885583.5k
  • myshell-ai/OpenVoice

    Instant voice cloning by MIT and MyShell.

    Language:Python29.9k2162522.9k
  • mudler/LocalAI

    🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference

    Language:Go26.1k1928862k
  • fishaudio/fish-speech

    Brand new TTS solution

    Language:Python14.6k984101.1k
  • NVIDIA/NeMo

    A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

    Language:Python12.2k2072.3k2.5k
  • PaddlePaddle/PaddleSpeech

    Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

    Language:Python11.2k1841.9k1.9k
  • pot-app/pot-desktop

    🌈 A cross-platform application for select-to-translate text translation and OCR.

    Language:JavaScript10.5k41736475
  • mozilla/TTS

    🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

    Language:Jupyter Notebook9.4k1865651.3k
  • fishaudio/Bert-VITS2

    vits2 backbone with multilingual-bert

    Language:Python8k4901.1k
  • Plachtaa/VALL-E-X

    An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

    Language:Python7.7k81153764
  • jianchang512/clone-voice

    A voice cloning tool with a web interface that uses your own voice or any other voice to record audio.

    Language:Python7.5k38138767
  • netease-youdao/EmotiVoice

    EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

    Language:Python7.5k62155632
  • jaywalnut310/vits

    VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

    Language:Python6.9k542061.3k
  • rhasspy/piper

    A fast, local neural text to speech system

    Language:C++6.6k79474484
  • FunAudioLLM/CosyVoice

    Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

    Language:Python6.4k64528686
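  • wzpan/wukong-robot

    🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot / smart speaker project. It supports ChatGPT multi-turn conversation and may be the first open-source smart speaker project to support brain-computer interaction.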

    Language:Python6.4k1742961.3k
  • rany2/edge-tts

    Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

    Language:Python6.3k49235624
  • jianchang512/ChatTTS-ui

    A simple local web interface that uses ChatTTS to synthesize text into speech and also exposes an API for external use.

    Language:Python6.3k38228740
  • shidahuilang/shuyuan

    Book sources for reading apps: 香色闺阁 + 阅读 3.0 book sources + 源阅读 + 爱阅书香 + 千阅 + 花火阅读 + 读不舍手 + IPTV sources + TrollStore IPA apps, all automatically updated

    Language:HTML5.9k6020354
  • LokerL/tts-vue

    🎤 A Microsoft speech synthesis tool built with Electron + Vue + ElementPlus + Vite.

    Language:TypeScript5.8k42151837
  • snakers4/silero-models

    Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

    Language:Jupyter Notebook5k86131315
  • yl4579/StyleTTS2

    StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

    Language:Python5k77197418
  • myshell-ai/MeloTTS

    High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.

    Language:Python4.8k40185631
  • MoonInTheRiver/DiffSinger

    DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

    Language:Python4.3k43102717
  • collabora/WhisperSpeech

    An Open Source text-to-speech system built by inverting Whisper.

    Language:Jupyter Notebook4k76112216
  • NexaAI/nexa-sdk

    Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), automatic speech recognition (ASR), and text-to-speech (TTS) capabilities.

    Language:Python4k35237585
  • metavoiceio/metavoice-src

    Foundational model for human-like, expressive TTS

    Language:Python3.9k79127661
  • TensorSpeech/TensorFlowTTS

    😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

    Language:Python3.8k79686815
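  • jing332/tts-server-android

    An Android system TTS app with a built-in Microsoft demo interface. It supports custom HTTP requests, importing other local TTS engines, and simple narration/dialogue detection for read-aloud based on Chinese double quotation marks, plus automatic retry, fallback configurations, text replacement, and more.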

    Language:Kotlin3.4k240287
  • Ikaros-521/AI-Vtuber

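    AI Vtuber is a virtual streamer [Live2D/UE/xuniren] driven by [ChatterBot/ChatGPT/claude/langchain/chatglm/text-gen-webui/闻达/千问/kimi/ollama]. It can interact with viewers in real time during live streams on [Bilibili/Douyin/Kuaishou/WeChat Channels/Pinduoduo/Douyu/YouTube/twitch/TikTok] or chat directly on the local machine. It uses TTS [edge-tts/VITS/elevenlabs/bark/bert-vits2/睿声] to generate responses, can optionally apply voice conversion with [so-vits-svc/DDSP-SVC], and can trigger Stable Diffusion image generation through commands.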

    Language:Python3.1k30177477
  • zzw922cn/awesome-speech-recognition-speech-synthesis-papers

    Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)