text-to-speech

There are 4,190 repositories under the text-to-speech topic.

  • RVC-Boss/GPT-SoVITS

    Just 1 minute of voice data is enough to train a good TTS model (few-shot voice cloning).

    Language:Python50.9k2321.7k5.6k
  • unslothai/unsloth

    Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.

    Language:Python45.5k2622.5k3.7k
  • coqui-ai/TTS

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

    Language:Python42.6k3251.2k5.6k
  • 2noise/ChatTTS

    A generative speech model for daily dialogue.

    Language:Python37.8k1946294.1k
  • babysor/MockingBird

    🚀 AI voice cloning: clone a voice in 5 seconds and generate arbitrary speech in real time

    Language:Python36.6k3088885.3k
  • myshell-ai/OpenVoice

    Instant voice cloning by MIT and MyShell. Audio foundation model.

    Language:Python34.4k2433283.8k
  • nari-labs/dia

    A TTS model capable of generating ultra-realistic dialogue in one pass.

    Language:Python18.4k1.6k
  • leon-ai/leon

    🧠 Leon is your open-source personal assistant.

    Language:TypeScript16.6k2763061.4k
  • FunAudioLLM/CosyVoice

    Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

    Language:Python16.4k1151.3k1.8k
  • jianchang512/pyvideotrans

    Translate a video from one language to another and add dubbing. Also supports speech recognition and transcription, speech synthesis, and subtitle translation.

    Language:Python14.2k908031.6k
  • rhasspy/piper

    A fast, local neural text to speech system

    Language:C++10k86552818
  • mozilla/TTS

    🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

    Language:Jupyter Notebook10k1865661.3k
  • index-tts/index-tts

    An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

    Language:Python9.9k2219948
  • espnet/espnet

    End-to-End Speech Processing Toolkit

    Language:Python9.5k1682.5k2.3k
  • open-mmlab/Amphion

    Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

    Language:Python9.4k84290757
  • rany2/edge-tts

    Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

    Language:Python9.1k64280850
  • netease-youdao/EmotiVoice

    EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

    Language:Python8.3k71164730
  • Plachtaa/VALL-E-X

    An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

    Language:Python7.9k85156790
  • jaywalnut310/vits

    VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

    Language:Python7.7k542091.4k
  • k2-fsa/sherpa-onnx

    Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, support 12 programming languages

    Language:C++7.4k871k867
  • myshell-ai/MeloTTS

    High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

    Language:Python6.8k47258945
  • yl4579/StyleTTS2

    StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

    Language:Python6k78224613
  • espeak-ng/espeak-ng

    eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

    Language:C5.6k1091.1k1.1k
  • snakers4/silero-models

    Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

    Language:Jupyter Notebook5.5k87135346
  • promptslab/Awesome-Prompt-Engineering

    This repository contains hand-curated resources for Prompt Engineering, with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.

    Language:Python4.9k781488
  • abus-aikorea/voice-pro

    Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

    Language:Python4.8k3545412
  • MoonInTheRiver/DiffSinger

    DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

    Language:Python4.6k42104770
  • gradio-app/fastrtc

    The python library for real-time communication

    Language:JavaScript4.3k38190394
  • metavoiceio/metavoice-src

    Foundational model for human-like, expressive TTS

    Language:Python4.2k83129694
  • TensorSpeech/TensorFlowTTS

    😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

    Language:Python4k79688813
  • denizsafak/abogen

    Generate audiobooks from EPUBs, PDFs and text with synchronized captions.

    Language:Python3.6k187
  • KoljaB/RealtimeTTS

    Converts text to speech in realtime

    Language:Python3.5k38202342
  • collabora/WhisperLive

    A nearly-live implementation of OpenAI's Whisper.

    Language:Python3.4k41241463
  • enhuiz/vall-e

    An unofficial PyTorch implementation of the audio LM VALL-E

    Language:Python3k8798413
  • Camb-ai/MARS5-TTS

    MARS5 speech model (TTS) from CAMB.AI

    Language:Jupyter Notebook2.8k3651246
  • readbeyond/aeneas

    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

    Language:Python2.7k71215264