segmentationFaults

segmentationFaults's Stars

xingchensong/S3Tokenizer
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice
Language:Python1079
MoyGcc/vid2avatar
Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition (CVPR2023)
Language: Python · 1.2k stars · 102 forks
2noise/ChatTTS
A generative speech model for daily dialogue.
Language: Python · 31.4k stars · 3.4k forks
jasonppy/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Language: Jupyter Notebook · 7.5k stars · 739 forks
segmentationFaults/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
1
JosephPai/Awesome-Talking-Face
📖 A curated list of resources dedicated to talking face.
1.3k stars · 111 forks
daotrungkien/mysql-modern-cpp
Lightweight header-only wrapper for MySQL with simple and convenient usage in modern C++ (C++11 or later)
Language:C++4919
fishaudio/fish-speech
Brand new TTS solution
Language: Python · 13.3k stars · 989 forks
microsoft/CLAP
Learning audio concepts from natural language supervision
Language:Python46735
jim-schwoebel/voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
1.7k stars · 225 forks
sh-lee-prml/HierSpeechpp
The official implementation of HierSpeech++
Language: Python · 1.2k stars · 134 forks
X-LANCE/UniCATS-CTX-vec2wav
[AAAI 2024] Code for CTX-vec2wav in UniCATS
Language:Python11916
collabora/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
Language: Jupyter Notebook · 3.8k stars · 208 forks
manmay-nakhashi/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Language:Python5
gemelo-ai/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
Language:Python77888
yeyupiaoling/AudioClassification-Pytorch
The Pytorch implementation of sound classification supports EcapaTdnn, PANNS, TDNN, Res2Net, ResNetSE and other models, as well as a variety of preprocessing methods.
Language:Python39279
k2-fsa/multi_quantization
Language:Python419
asteroid-team/torch-audiomentations
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
Language:Python93487
wenet-e2e/wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Language: Python · 698 stars · 115 forks
shibing624/pycorrector
pycorrector is a toolkit for text error correction. It applies models such as Kenlm, T5, MacBERT, ChatGLM3, and Qwen2.5 to error-correction scenarios and works out of the box.
Language: Python · 5.5k stars · 1.1k forks
s3prl/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Language: Python · 2.2k stars · 484 forks
segmentationFaults/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
1
alphacep/vosk-server
WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Language: Python · 906 stars · 244 forks
k2-fsa/icefall
Language: Python · 906 stars · 288 forks
k2-fsa/k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
Language: Cuda · 1.1k stars · 213 forks
thuhcsi/Crystal
Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.
Language:C++22068
Liu-Feng-deeplearning/TTS-frontend
TTS-frontend with Bert and CRF/lstm (For Tacotron)
Language:Python4917
midas-research/audino
Open source audio annotation tool for humans
Language: JavaScript · 1.1k stars · 128 forks
caolanm/callcatcher
find unused code by collecting methods defined but not called or referenced
Language:Python9515
TensorSpeech/TensorFlowTTS
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)
Language: Python · 3.8k stars · 812 forks