text-to-speech
There are 3595 repositories under text-to-speech topic.
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
babysor/MockingBird
🚀AI拟声: 5秒内克隆您的声音并生成任意语音内容 Clone a voice in 5 seconds to generate arbitrary speech in real-time
2noise/ChatTTS
A generative speech model for daily dialogue.
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
leon-ai/leon
🧠 Leon is your open-source personal assistant.
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
jianchang512/pyvideotrans
Translate the video from one language to another and add dubbing. 将视频从一种语言翻译为另一种语言,同时支持语音识别转录、语音合成、字幕翻译。
mozilla/TTS
:robot: :speech_balloon: Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
espnet/espnet
End-to-End Speech Processing Toolkit
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
rhasspy/piper
A fast, local neural text to speech system
Plachtaa/VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/
rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, support 11 programming languages
snakers4/silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
espeak-ng/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than hundred languages and accents.
MoonInTheRiver/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
promptslab/Awesome-Prompt-Engineering
This repository contains a hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM etc
metavoiceio/metavoice-src
Foundational model for human-like, expressive TTS
TensorSpeech/TensorFlowTTS
:stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)
abus-aikorea/voice-pro
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
freddyaboulton/fastrtc
The python library for real-time communication
enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
KoljaB/RealtimeTTS
Converts text to speech in realtime
Camb-ai/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
collabora/WhisperLive
A nearly-live implementation of OpenAI's Whisper.
readbeyond/aeneas
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
elevenlabs/elevenlabs-python
The official Python API for ElevenLabs Text to Speech.
marytts/marytts
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure java
pndurette/gTTS
Python library and CLI tool to interface with Google Translate's text-to-speech API
6drf21e/ChatTTS_colab
🚀 一键部署(含离线整合包)!基于 ChatTTS ,支持流式输出、音色抽卡、长音频生成和分角色朗读。简单易用,无需复杂安装。