ygyuan's Stars
LetterLiGo/SafeEar
The Official Code Repo of SafeEar (Accepted by CCS 2024)
hayeong0/DDDM-VC
Official PyTorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)
facebookresearch/speech-resynthesis
An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.
hhguo/SoCodec
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
jishengpeng/WavTokenizer
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
Plachtaa/seed-vc
Zero-shot voice conversion & singing voice conversion with in-context learning
svc-develop-team/so-vits-svc
SoftVC VITS Singing Voice Conversion
RVC-Project/Retrieval-based-Voice-Conversion-WebUI
Easily train a good VC model with voice data <= 10 mins!
jiaaro/pydub
Manipulate audio with a simple and easy high-level interface
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
kyutai-labs/moshi
xingchensong/S3Tokenizer
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice
liutaocode/TTS-arxiv-daily
Automatically update text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)
FireRedTeam/FireRedTTS
An Open-Sourced LLM-empowered Foundation TTS System
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
innnky/emotional-vits
An emotion-controllable speech synthesis model based on VITS that requires no emotion annotations
lenML/ChatTTS-Forge
🍦 ChatTTS-Forge is a project developed around a TTS generation model, implementing an API server and a Gradio-based WebUI.
ymcui/Chinese-ELECTRA
Pre-trained Chinese ELECTRA models
thuhcsi/NeuCoSVC
AILab-CVC/YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
FunAudioLLM/CosyVoice
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
Camb-ai/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
lifeiteng/naturalspeech3_facodec
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
YoMio-Tech-Inc/GPT-SoVITS2
GPT-SoVITS2
Shengqiang-Li/TTS-Evaluation
Evaluation metrics for TTS models.
DigitalPhonetics/IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
emo-box/EmoBox
[INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark
SpeechColab/GigaSpeech2
An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement