speech-synthesis
There are 1239 repositories under the speech-synthesis topic.
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
leon-ai/leon
🧠 Leon is your open-source personal assistant.
NVIDIA/DeepLearningExamples
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won the NAACL 2022 Best Demo Award.
voicepaw/so-vits-svc-fork
so-vits-svc fork with realtime support, improved interface and more features.
espnet/espnet
End-to-End Speech Processing Toolkit
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
jaywalnut310/vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
rhasspy/piper
A fast, local neural text to speech system
rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
snakers4/silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
MoonInTheRiver/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
espeak-ng/espeak-ng
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
collabora/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
metavoiceio/metavoice-src
Foundational model for human-like, expressive TTS
TensorSpeech/TensorFlowTTS
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
huggingface/speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT-4o
zzw922cn/awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
keithito/tacotron
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
tensorflow/lingvo
Lingvo
Camb-ai/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
marytts/marytts
MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java
r9y9/wavenet_vocoder
WaveNet vocoder
cogentapps/chat-with-gpt
An open-source ChatGPT app with a voice
Rayhane-mamah/Tacotron-2
DeepMind's Tacotron-2 TensorFlow implementation
fatchord/WaveRNN
WaveRNN Vocoder + TTS
stakira/OpenUtau
Open singing synthesis platform / Open source UTAU successor
KoljaB/RealtimeTTS
Converts text to speech in realtime
r9y9/deepvoice3_pytorch
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
jik876/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
kalliope-project/kalliope
Kalliope is a framework that will help you to create your own personal assistant.
kan-bayashi/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
NVIDIA/OpenSeq2Seq
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP