text-to-speech

There are 4190 repositories under the text-to-speech topic.

RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Language: Python 50.9k 232 1.7k 5.6k
unslothai/unsloth
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
Language: Python 45.5k 262 2.5k 3.7k
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Language: Python 42.6k 325 1.2k 5.6k
2noise/ChatTTS
A generative speech model for daily dialogue.
Language: Python 37.8k 194 6294.1k
babysor/MockingBird
🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time
Language: Python 36.6k 308 8885.3k
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
Language: Python 34.4k 243 3283.8k
nari-labs/dia
A TTS model capable of generating ultra-realistic dialogue in one pass.
Language: Python 18.4k 1.6k
leon-ai/leon
🧠 Leon is your open-source personal assistant.
Language: TypeScript 16.6k 276 3061.4k
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Language: Python 16.4k 115 1.3k 1.8k
jianchang512/pyvideotrans
Translate a video from one language to another and add dubbing, with support for speech-recognition transcription, speech synthesis, and subtitle translation.
Language: Python 14.2k 90 8031.6k
rhasspy/piper
A fast, local neural text to speech system
Language: C++ 10k 86 552818
mozilla/TTS
🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Language: Jupyter Notebook 10k 186 5661.3k
index-tts/index-tts
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Language: Python 9.9k 22 19948
espnet/espnet
End-to-End Speech Processing Toolkit
Language: Python 9.5k 168 2.5k 2.3k
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python 9.4k 84 290757
rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
Language: Python 9.1k 64 280850
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Language: Python 8.3k 71 164730
Plachtaa/VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
Language: Python 7.9k 85 156790
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Language: Python 7.7k 54 2091.4k
k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, support 12 programming languages
Language: C++ 7.4k 87 1k 867
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
Language: Python 6.8k 47 258945
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Language: Python 6k 78 224613
espeak-ng/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
Language: C 5.6k 109 1.1k 1.1k
snakers4/silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Language: Jupyter Notebook 5.5k 87 135346
promptslab/Awesome-Prompt-Engineering
This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.
Language: Python 4.9k 78 1488
abus-aikorea/voice-pro
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
Language: Python 4.8k 35 45412
MoonInTheRiver/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
Language: Python 4.6k 42 104770
gradio-app/fastrtc
The python library for real-time communication
Language: JavaScript 4.3k 38 190394
metavoiceio/metavoice-src
Foundational model for human-like, expressive TTS
Language: Python 4.2k 83 129694
TensorSpeech/TensorFlowTTS
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
Language: Python 4k 79 688813
denizsafak/abogen
Generate audiobooks from EPUBs, PDFs and text with synchronized captions.
Language: Python 3.6k 187
KoljaB/RealtimeTTS
Converts text to speech in realtime
Language: Python 3.5k 38 202342
collabora/WhisperLive
A nearly-live implementation of OpenAI's Whisper.
Language: Python 3.4k 41 241463
enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
Language: Python 3k 87 98413
Camb-ai/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
Language: Jupyter Notebook 2.8k 36 51246
readbeyond/aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Language: Python 2.7k 71 215264