tts
There are 2540 repositories under tts topic.
CorentinJ/Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
lobehub/lobe-chat
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers( OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), Knowledge Base (file upload / knowledge management / RAG ), Multi-Modals (Vision/TTS) and plugin system. One-click FREE deployment of your private ChatGPT/ Claude application.
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
babysor/MockingBird
🚀AI拟声: 5秒内克隆您的声音并生成任意语音内容 Clone a voice in 5 seconds to generate arbitrary speech in real-time
2noise/ChatTTS
A generative speech model for daily dialogue.
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
mudler/LocalAI
:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference
fishaudio/fish-speech
Brand new TTS solution
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
pot-app/pot-desktop
🌈一个跨平台的划词翻译和OCR软件 | A cross-platform software for text translation and recognition.
mozilla/TTS
:robot: :speech_balloon: Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
fishaudio/Bert-VITS2
vits2 backbone with multilingual-bert
Plachtaa/VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/
jianchang512/clone-voice
A sound cloning tool with a web interface, using your voice or any sound to record audio / 一个带web界面的声音克隆工具,使用你的音色或任意声音来录制音频
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
rhasspy/piper
A fast, local neural text to speech system
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
wzpan/wukong-robot
🤖 wukong-robot 是一个简单、灵活、优雅的中文语音对话机器人/智能音箱项目,支持ChatGPT多轮对话能力,还可能是首个支持脑机交互的开源智能音箱项目。
rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
jianchang512/ChatTTS-ui
一个简单的本地网页界面,使用ChatTTS将文字合成为语音,同时支持对外提供API接口。A simple native web interface that uses ChatTTS to synthesize text into speech, along with support for external API interfaces.
shidahuilang/shuyuan
阅读书源-香色闺阁+阅读3.0书源+源阅读+爱阅书香+千阅+花火阅读+读不舍手+IPTV源+IPA巨魔应用=自动更新
LokerL/tts-vue
🎤 微软语音合成工具,使用 Electron + Vue + ElementPlus + Vite 构建。
snakers4/silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
MoonInTheRiver/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
collabora/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
NexaAI/nexa-sdk
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), auto-speech-recognition (ASR), and text-to-speech (TTS) capabilities.
metavoiceio/metavoice-src
Foundational model for human-like, expressive TTS
TensorSpeech/TensorFlowTTS
:stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)
jing332/tts-server-android
这是一个Android系统TTS应用,内置微软演示接口,可自定义HTTP请求,可导入其他本地TTS引擎,以及根据中文双引号的简单旁白/对话识别朗读 ,还有自动重试,备用配置,文本替换等更多功能。
Ikaros-521/AI-Vtuber
AI Vtuber是一个由 【ChatterBot/ChatGPT/claude/langchain/chatglm/text-gen-webui/闻达/千问/kimi/ollama】 驱动的虚拟主播【Live2D/UE/xuniren】,可以在 【Bilibili/抖音/快手/微信视频号/拼多多/斗鱼/YouTube/twitch/TikTok】 直播中与观众实时互动 或 直接在本地进行聊天。它使用TTS技术【edge-tts/VITS/elevenlabs/bark/bert-vits2/睿声】生成回答并可以选择【so-vits-svc/DDSP-SVC】变声;指令协同SD画图。
zzw922cn/awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)