speech-synthesis

There are 1,388 repositories under the speech-synthesis topic.

coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Language: Python 42.6k 325 1.2k 5.6k
leon-ai/leon
🧠 Leon is your open-source personal assistant.
Language: TypeScript 16.6k 276 3061.4k
NVIDIA-NeMo/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python 15.7k 229 2.8k 3.1k
NVIDIA/DeepLearningExamples
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Language: Jupyter Notebook 14.5k 292 8643.4k
PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Language: Python 12.2k 188 2k 1.9k
rhasspy/piper
A fast, local neural text to speech system
Language: C++ 10k 86 552817
espnet/espnet
End-to-End Speech Processing Toolkit
Language: Python 9.5k 168 2.5k 2.3k
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python 9.4k 84 290757
voicepaw/so-vits-svc-fork
so-vits-svc fork with realtime support, improved interface and more features.
Language: Python 9.1k 64 3741.2k
rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
Language: Python 9.1k 64 280850
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Language: Python 8.3k 71 164730
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Language: Python 7.7k 54 2091.4k
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Language: Python 6k 78 224613
espeak-ng/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
Language: C 5.6k 109 1.1k 1.1k
snakers4/silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Language: Jupyter Notebook 5.5k 87 135346
abus-aikorea/voice-pro
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
Language: Python 4.8k 35 45412
MoonInTheRiver/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
Language: Python 4.6k 42 104770
WhisperSpeech/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
Language: Jupyter Notebook 4.4k 78 113249
huggingface/speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT4-o
Language: Python 4.2k 48 98474
metavoiceio/metavoice-src
Foundational model for human-like, expressive TTS
Language: Python 4.2k 83 129694
TensorSpeech/TensorFlowTTS
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
Language: Python 4k 79 688813
denizsafak/abogen
Generate audiobooks from EPUBs, PDFs and text with synchronized captions.
Language: Python 3.6k 187
KoljaB/RealtimeTTS
Converts text to speech in realtime
Language: Python 3.5k 38 202342
stakira/OpenUtau
Open singing synthesis platform / Open source UTAU successor
Language: C# 3.1k 63 433394
zzw922cn/awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
3.1k 186 7512
keithito/tacotron
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
Language: Python 3k 148 323954
tensorflow/lingvo
Lingvo
Language: Python 2.9k 116 255452
Camb-ai/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
Language: Jupyter Notebook 2.8k 36 51246
Blaizzy/mlx-audio
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.
Language: Python 2.7k 8 20210
marytts/marytts
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java
Language: Java 2.5k 133 828764
cogentapps/chat-with-gpt
An open-source ChatGPT app with a voice
Language: TypeScript 2.4k 30 143491
r9y9/wavenet_vocoder
WaveNet vocoder
Language: Python 2.4k 95 193496
Rayhane-mamah/Tacotron-2
DeepMind's Tacotron-2 Tensorflow implementation
Language: Python 2.3k 132 472913
jik876/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Language: Python 2.2k 33 167535
fatchord/WaveRNN
WaveRNN Vocoder + TTS
Language: Python 2.2k 87 227695
r9y9/deepvoice3_pytorch
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
Language: Python 2k 92 194487
