Robinatp's Stars
CompVis/stable-diffusion
A latent text-to-image diffusion model
Stability-AI/stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
openvpi/DiffSinger
An advanced singing voice synthesis system with high fidelity, expressiveness, controllability and flexibility based on DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
haoheliu/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
archinetai/audio-ai-timeline
A timeline of the latest AI models for audio generation, starting in 2023!
csteinmetz1/ai-audio-startups
Community list of startups working with AI in audio and music technology
lucidrains/naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
PlayVoice/lora-svc
Singing voice conversion based on Whisper, with LoRA for singing voice cloning
zhangyongmao/VISinger2
VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer
keonlee9420/DiffGAN-TTS
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
xunmengshe/OpenUtau
YatingMusic/ddsp-singing-vocoders
Official implementation of SawSing (ISMIR'22)
adelacvg/NS2VC
Unofficial implementation of NaturalSpeech2 for Voice Conversion and Text to Speech
M4Singer/M4Singer
ncsoft/avocodo
Official implementation of "Avocodo: Generative Adversarial Network for Artifact-Free Vocoder" (AAAI2023)
CODEJIN/NaturalSpeech2
yl4579/HiFTNet
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform
lesterphillip/SVCC23_FastSVC
Singing Voice Conversion Challenge 2023 Starter Kit: FastSVC Reimplementation
keonlee9420/FastPitchFormant
PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis
yousa-ling-official-production/yousa-ling-diffsinger-v1
Version 1 of the DiffSinger model for 泠鸢yousa
openvpi/DiffSingerMiniEngine
A minimum inference engine for DiffSinger
timedomain-tech/ACE_phonemes
A guide to grapheme-to-phoneme conversion and the phoneme list for the ACE singing voice synthesis engine
seyong92/phoneme-informed-note-level-singing-transcription
A pretrained model for "A Phoneme-informed Neural Network Model for Note-level Singing Transcription", ICASSP 2023
fishaudio/OpenUtau
OpenUTAU renderer for DiffSinger. Usage instructions (in Chinese): https://github.com/xunmengshe/OpenUtau/wiki/%E4%BD%BF%E7%94%A8%E6%96%B9%E6%B3%95%EF%BC%88%E4%B8%AD%E6%96%87%EF%BC%89
chomeyama/UnifiedSourceFilterGAN
timedomain-tech/ACE_sequence_file
Open-source file format designed for high-quality, customizable singing synthesis.
A-Quarter-Mile/PHONEix
PHONEix: Acoustic Feature Processing Strategy for Enhanced Singing Pronunciation with Phoneme Distribution Predictor
MaxMax2016/EasyVC
A comprehensive comparison of voice conversion techniques
Robinatp/SECaps