tts

There are 3,452 repositories under the tts topic.

  • CorentinJ/Real-Time-Voice-Cloning

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

    Language:Python55.7k9431.1k9.2k
  • RVC-Boss/GPT-SoVITS

    One minute of voice data is enough to train a good TTS model (few-shot voice cloning).

    Language:Python50.9k2321.7k5.6k
  • unslothai/unsloth

    Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.

    Language:Python45.5k2622.5k3.7k
  • coqui-ai/TTS

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

    Language:Python42.6k3251.2k5.6k
  • 2noise/ChatTTS

    A generative speech model for daily dialogue.

    Language:Python37.8k1946294.1k
  • babysor/MockingBird

    🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time

    Language:Python36.6k3088885.3k
  • mudler/LocalAI

    🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference

    Language:Go35.3k2321.1k2.8k
  • myshell-ai/OpenVoice

    Instant voice cloning by MIT and MyShell. Audio foundation model.

    Language:Python34.4k2433283.8k
  • fishaudio/fish-speech

    SOTA Open Source TTS

    Language:Python22.9k1195461.9k
  • mastra-ai/mastra

    The TypeScript AI agent framework. ⚡ Assistants, RAG, observability. Supports any LLM: GPT-4, Claude, Gemini, Llama.

    Language:TypeScript16.6k492681.1k
  • FunAudioLLM/CosyVoice

    Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

    Language:Python16.4k1151.3k1.8k
  • NVIDIA-NeMo/NeMo

    A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

    Language:Python15.7k2292.8k3.1k
  • pot-app/pot-desktop

    🌈 Cross-platform software for selection-based text translation and OCR.

    Language:JavaScript15.3k54897709
  • readest/readest

    Readest is a modern, feature-rich ebook reader designed for avid readers offering seamless cross-platform access, powerful tools, and an intuitive interface to elevate your reading experience.

    Language:TypeScript12.5k22318670
  • PaddlePaddle/PaddleSpeech

    Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

    Language:Python12.2k1882k1.9k
  • DrewThomasson/ebook2audiobook

    Generate audiobooks from e-books, voice cloning & 1107+ languages!

    Language:Python11.3k50190847
  • rhasspy/piper

    A fast, local neural text to speech system

    Language:C++10k86552817
  • mozilla/TTS

    🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

    Language:Jupyter Notebook10k1865661.3k
  • index-tts/index-tts

    An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

    Language:Python9.9k2219948
  • rany2/edge-tts

    Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

    Language:Python9.1k64280850
  • jianchang512/clone-voice

    A voice cloning tool with a web interface that uses your own voice or any other voice to record audio

    Language:Python8.8k43147944
  • fishaudio/Bert-VITS2

    vits2 backbone with multilingual-bert

    Language:Python8.6k5101.2k
  • krillinai/KrillinAI

    A video translation and dubbing tool powered by LLMs, offering bidirectional translation across 99 languages and one-click full-process deployment. It can generate content adapted to platforms such as Douyin, Xiaohongshu, Bilibili, WeChat Channels, TikTok, and YouTube Shorts.

    Language:Go8.5k1123679
  • shidahuilang/shuyuan

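    Book sources for reading apps: 香色闺阁 + 用心读书 + 源阅 + 阅读 3.0 book sources + 源阅读 + 爱阅书香 + 千阅 + 花火阅读 + 读不舍手 + 番茄 (Fanqie) + 喜马拉雅 (Ximalaya) + comics + audiobooks + book sources + IPTV sources + IPA apps for TrollStore = automatically updated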

    Language:Python8.4k7924488
  • netease-youdao/EmotiVoice

    EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

    Language:Python8.3k71164730
  • Plachtaa/VALL-E-X

    An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

    Language:Python7.9k85156790
  • jaywalnut310/vits

    VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

    Language:Python7.7k542091.4k
  • duixcom/Duix-Mobile

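    🚀 The best-performing mobile real-time conversational digital human available online. Supports local deployment and multimodal interaction (voice, text, facial expressions), with response times under 1.5 seconds. Suited to scenarios with very high privacy and real-time requirements, such as live streaming, teaching, customer service, finance, and government services. Works out of the box and is developer-friendly.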

    Language:C++7.5k1.1k
  • jianchang512/ChatTTS-ui

    A simple local web interface that uses ChatTTS to synthesize text into speech and also exposes an API for external use.

    Language:Python7.3k39251894
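  • wzpan/wukong-robot

    🤖 wukong-robot is a simple, flexible, and elegant Chinese voice dialogue robot and smart speaker project. It supports multi-turn ChatGPT conversations and may be the first open-source smart speaker project to support brain-computer interaction.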

    Language:Python7k1753051.4k
  • myshell-ai/MeloTTS

    High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.

    Language:Python6.8k47258945
  • LokerL/tts-vue

    🎤 A Microsoft speech synthesis tool built with Electron + Vue + ElementPlus + Vite.

    Language:TypeScript6k42154868
  • yl4579/StyleTTS2

    StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

    Language:Python6k78224613
  • canopyai/Orpheus-TTS

    Towards Human-Sounding Speech

    Language:Python5.6k5782462
  • santinic/audiblez

    Generate audiobooks from e-books

    Language:Python5.5k1462360
  • snakers4/silero-models

    Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

    Language:Jupyter Notebook5.5k87135346