segmentationFaults's Stars
xingchensong/S3Tokenizer
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice
MoyGcc/vid2avatar
Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition (CVPR2023)
2noise/ChatTTS
A generative speech model for daily dialogue.
jasonppy/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
segmentationFaults/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
JosephPai/Awesome-Talking-Face
📖 A curated list of resources dedicated to talking face.
daotrungkien/mysql-modern-cpp
Lightweight header-only wrapper for MySQL with simple and convenient usage in modern C++ (C++11 or later)
fishaudio/fish-speech
Brand new TTS solution
microsoft/CLAP
Learning audio concepts from natural language supervision
jim-schwoebel/voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
sh-lee-prml/HierSpeechpp
The official implementation of HierSpeech++
X-LANCE/UniCATS-CTX-vec2wav
[AAAI 2024] Code for CTX-vec2wav in UniCATS
collabora/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
manmay-nakhashi/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
gemelo-ai/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
yeyupiaoling/AudioClassification-Pytorch
A PyTorch implementation of sound classification that supports EcapaTdnn, PANNS, TDNN, Res2Net, ResNetSE, and other models, as well as a variety of preprocessing methods.
k2-fsa/multi_quantization
asteroid-team/torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
wenet-e2e/wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
shibing624/pycorrector
pycorrector is a toolkit for text error correction. It applies Kenlm, T5, MacBERT, ChatGLM3, Qwen2.5, and other models to error-correction scenarios and works out of the box.
s3prl/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
segmentationFaults/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
alphacep/vosk-server
WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
k2-fsa/icefall
k2-fsa/k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
thuhcsi/Crystal
Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.
Liu-Feng-deeplearning/TTS-frontend
TTS-frontend with BERT and CRF/LSTM (for Tacotron)
midas-research/audino
Open source audio annotation tool for humans
caolanm/callcatcher
find unused code by collecting methods defined but not called or referenced
TensorSpeech/TensorFlowTTS
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)