wen0320

Speech Separation, TTS

guangzhou

wen0320's Stars

wenet-e2e/wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Language:Python736123
RVC-Project/Retrieval-based-Voice-Conversion-WebUI
Easily train a good VC model with voice data <= 10 mins!
Language: Python, 24.8k stars, 3.6k forks
astral-sh/uv
An extremely fast Python package and project manager, written in Rust.
Language: Rust, 27.8k stars, 800 forks
THUDM/CogVideo
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Language: Python, 9.4k stars, 890 forks
VITA-MLLM/Freeze-Omni
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
Language:Python1143
haidog-yaqub/EzAudio
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
Language:Python2388
opendilab/CleanS2S
High-quality and streaming Speech-to-Speech interactive agent in a single file. A streaming, full-duplex voice interaction prototype agent implemented in just one file!
Language:Python26121
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Jupyter Notebook, 7.8k stars, 589 forks
THUDM/GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English speech dialogue model
Language: Python, 2.3k stars, 188 forks
haoheliu/AudioLDM2
Text-to-Audio/Music Generation
Language: Python, 2.3k stars, 182 forks
haoheliu/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Language: Python, 2.5k stars, 224 forks
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Language: Python, 21.1k stars, 2.2k forks
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
Language: Jupyter Notebook, 36.2k stars, 4.3k forks
Audio-AGI/WavJourney
WavJourney: Compositional Audio Creation with LLMs
Language:Python52245
Bai-YT/ConsistencyTTA
ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
Language:Python32
open-mmlab/FoleyCrafter
FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds. An AI Foley master that adds vivid, synchronized sound effects to your silent videos 😝
Language:Python47241
Stability-AI/stable-audio-tools
Generative models for conditional audio generation
Language: Python, 2.7k stars, 259 forks
liutaocode/TTS-arxiv-daily
Automatically Update Text-to-speech (TTS) Papers Daily using Github Actions (Update Every 12th hours)
Language:Python29821
ivcylc/qa-mdt
OpenMusic: SOTA Text-to-music (TTM) Generation
Language:Python48647
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Language: Python, 7.5k stars, 933 forks
2noise/ChatTTS
A generative speech model for daily dialogue.
Language: Python, 32.6k stars, 3.5k forks
yangdongchao/RSTnet
Real-time Speech-Text Foundation Model Toolkit (wip)
Language:Python12311
yeyupiaoling/SpeechEmotionRecognition-Pytorch
Speech emotion recognition implemented in PyTorch
Language:Python13525
xinchen-ai/Westlake-Omni
Language:Python17914
wdndev/llm_interview_note
Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers
Language: HTML, 3.9k stars, 444 forks
LetterLiGo/SafeEar
SafeEar: Content Privacy-Preserving Audio Deepfake Detection (Accepted by CCS 2024)
Language:Python458
CARNIVAL-IITP/Packet_loss_concealment
Language:Python2919
Crystalsound/FRN
Language:Python266
breizhn/tPLCnet
This repository contains the trained models and some audio samples for the tPLCnet.
Language:Python232
xiph/LPCNet
Efficient neural speech synthesis
Language: C, 1.1k stars, 295 forks
