npujcong's Stars
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
chenfei-wu/TaskMatrix
openai/consistency_models
Official repo for consistency models.
facebookresearch/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
openai/improved-diffusion
Release for Improved Denoising Diffusion Probabilistic Models
serp-ai/bark-with-voice-clone
🔊 Text-prompted Generative Audio Model - With the ability to clone voices
enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
archinetai/audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
archinetai/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
DigitalPhonetics/IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
LAION-AI/CLAP
Contrastive Language-Audio Pretraining
Harmonai-org/sample-generator
Tools to train a generative model on arbitrary audio samples
TencentGameMate/chinese_speech_pretrain
Chinese speech pretrained models
lucidrains/voicebox-pytorch
Implementation of Voicebox, a new SOTA text-to-speech network from Meta AI, in PyTorch
ZhangXInFD/SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on
modelscope/KAN-TTS
KAN-TTS is a speech-synthesis training framework. Please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech
guan-yuan/Awesome-Singing-Voice-Synthesis-and-Singing-Voice-Conversion
A paper and project list covering cutting-edge Speech Synthesis, Text-to-Speech (TTS), Singing Voice Synthesis (SVS), Voice Conversion (VC), Singing Voice Conversion (SVC), and related interesting works (such as Music Synthesis, Automatic Music Transcription, Automatic MOS Prediction, SSL-based ASR, etc.).
yl4579/StyleTTS
Official Implementation of StyleTTS
pystiche/pystiche
Framework for Neural Style Transfer (NST) built upon PyTorch
interactiveaudiolab/penn
Pitch Estimating Neural Networks (PENN)
152334H/DL-Art-School
TorToiSe fine-tuning with DLAS
ddlBoJack/Awesome-Speech-Pretraining
Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech.
Zain-Jiang/Speech-Editing-Toolkit
It's a repository for implementations of neural speech editing algorithms.
yl4579/StyleTTS-VC
Official Implementation of StyleTTS-VC
hhguo/MSMC-TTS
Official Implementation of Multi-Stage Multi-Codebook (MSMC) TTS
xinjli/transphone
Phoneme tokenizer and grapheme-to-phoneme model for 8k languages
neonbjb/tts-scores
Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models
tts-tutorial/book
xrenaa/Retriever
[ICLR2022] Code for "Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph"
ictnlp/GMA
Code for ACL 2022 Findings paper "Gaussian Multi-head Attention for Simultaneous Machine Translation"