blues-green

blues-green's Stars

RVC-Project/Retrieval-based-Voice-Conversion-WebUI
Easily train a good VC model with voice data <= 10 mins!
Language:Python28k 188 1.8k 4k
black-forest-labs/flux
Official inference repo for FLUX.1 models
Language:Python20.9k 177 2021.5k
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language:Python13.4k 215 2.5k 2.7k
PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Language:Python11.7k 187 2k 1.9k
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Language:Python10.6k 100 5601.5k
keithito/tacotron
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
Language:Python3k 148 323959
Rayhane-mamah/Tacotron-2
DeepMind's Tacotron-2 Tensorflow implementation
Language:Python2.3k 132 472913
ming024/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Language:Python2k 29 221561
Kyubyong/tacotron
A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
Language:Python1.8k 124 116436
modelscope/3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
Language:Python1.8k 22 134149
feizc/FluxMusic
Text-to-Music Generation with Rectified Flow Transformers
Language:Python1.7k 19 24134
LTH14/mar
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
Language:Python1.4k 18 7677
bytedance/music_source_separation
Language:Python1.3k 26 64198
BytedanceSpeech/seed-tts-eval
Language:Python1.2k 13 16113
sihyun-yu/REPA
[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Language:Python881 17 3343
xcmyz/FastSpeech
The Implementation of FastSpeech based on pytorch.
Language:Python867 35 97213
lturing/tacotronv2_wavernn_chinese
Chinese speech synthesis with tacotronV2 + wavernn (Tensorflow + pytorch)
Language:Python535 9 63133
Vaibhavs10/fast-whisper-finetuning
Language:Jupyter Notebook499 9 1740
KdaiP/StableTTS
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
Language:Python395 23 2342
hugofloresgarcia/vampnet
music generation with masked transformers!
Language:Python322 8 3637
Executedone/Chinese-FastSpeech2
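Continues training on the Biaobei (标贝) dataset and improves the original FastSpeech2 model by adding prosody representation and prosody prediction modules, making Chinese pronunciation more vivid and rhythmic.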
Language:Python258 6 2142
haoheliu/AudioLDM-training-finetuning
AudioLDM training, finetuning, evaluation and inference.
Language:Python239 16 4348
nicolaus625/FM4Music
The official GitHub page for the survey paper "Foundation Models for Music: A Survey".
198 7 16
chitosai/eye_protector
May it be the best eye protecting extension on chrome.
Language:JavaScript195 8 1716
haoheliu/SemantiCodec-inference
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space.
Language:Python193 5 1115
lifeiteng/naturalspeech3_facodec
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
Language:Python193 6 815
archinetai/cqt-pytorch
An invertible and differentiable implementation of the Constant-Q Transform (CQT).
Language:Python58 6 04
ZZWaang/whole-song-gen
Language:Python40 2 32
biboamy/instrument-streaming
Language:Python37 1 63
DezhiKong00/Sentencepiece-chinese-bbpe
Tokenizes Chinese corpora with Sentencepiece.
Language:Python10 1 15