Levent9's Stars
jaejunL/HYFace
naver-ai/facetts
facebookresearch/muavic
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
FunAudioLLM/CosyVoice
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
zhvng/open-musiclm
Implementation of MusicLM, a text to music model published by Google Research, with a few modifications.
TencentGameMate/chinese_speech_pretrain
Chinese speech pretrained models
NVIDIA/audio-flamingo
PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.
gabrielmittag/NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
jishengpeng/ControlSpeech
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec
YangAi520/LL-NSPP
modelscope/FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications such as text-to-speech synthesis and music generation.
yxlu-0102/AP-BWE
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction
YangAi520/LFS-NSPP
YangAi520/APNet
YangAi520/NSPP
Levent9/Zero-shot-FaceVC
danoneata/xts
A multi-speaker video-to-speech network
yxlu-0102/MP-SENet
Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement
yangdongchao/InstructTTS
The demo page of InstructTTS
Wendison/VQMIVC
Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!