CaoYuhang

Master's degree at USTC; speech enhancement, ASR, LLM

CaoYuhang's Stars

sindresorhus/awesome
😎 Awesome lists about all kinds of interesting topics
354k 7.8k 35028.8k
krahets/hello-algo
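Hello Algo: a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart code. Simplified and Traditional Chinese editions are updated in sync; English version ongoing
Language: Java · 111k 573 25713.9k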
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Language: Python · 43.3k 235 1.6k4.8k
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
Language: Python · 25.9k 203 5732.5k
EleutherAI/gpt-neo
An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
Language: Python · 8.3k 177 137961
kyutai-labs/moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Language: Python · 7.9k 88 115651
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
Language: Python · 5.8k 44 221781
metavoiceio/metavoice-src
Foundational model for human-like, expressive TTS
Language: Python · 4.1k 83 129682
Stability-AI/stable-audio-tools
Generative models for conditional audio generation
Language: Python · 3k 43 110294
Rikorose/DeepFilterNet
Noise suppression using deep filtering
Language: Python · 2.9k 34 301270
Plachtaa/seed-vc
Zero-shot voice conversion and singing voice conversion, with real-time support
Language: Python · 2.1k 40 132232
NUS-HPC-AI-Lab/VideoSys
VideoSys: An easy and efficient system for video generation
Language: Python · 1.9k 28 91129
juncongmoo/chatllama
ChatLLaMA 📢 Open source implementation for LLaMA-based ChatGPT runnable in a single GPU. 15x faster training process than ChatGPT
Language: Python · 1.2k 20 8135
ZihanWang314/RAGEN
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
Language: Python · 1.2k 19 2380
NexaAI/Awesome-LLMs-on-device
Awesome LLMs on Device: A Comprehensive Survey
1k 52 2101
ga642381/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
942 50 359
lucidrains/e2-tts-pytorch
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch
Language: Python · 456 26 2644
dongzhuoyao/awesome-flow-matching
A summary of related works about flow matching, stochastic interpolants
415 17 214
AMAAI-Lab/mustango
Mustango: Toward Controllable Text-to-Music Generation
Language: Python · 357 16 1829
LSimon95/megatts2
Unofficial implementation of Megatts2
Language: Python · 279 21 2037
AaronZ345/GTSinger
Dataset and code of GTSinger (NeurIPS 2024 Spotlight): A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks
Language: Python · 259 4 99
OpenT2S/LlamaVoice
LlamaVoice is a llama-based large voice generation model, providing inference and training ability.
Language: Python · 232 22 314
JusperLee/TIGER
TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation
Language: Python · 221 10 238
qiuqiao/SOFA
SOFA: Singing-Oriented Forced Aligner
Language: Python · 155 6 1624
ex3ndr/supervoice-voicebox
VoiceBox neural network implementation
Language: Jupyter Notebook · 105 11 1211
NeuralVox/OpenPhonemizer
An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GPL phonemizer.
Language: Python · 95 4 75
adelacvg/detail_tts
All generative models in one for a better TTS model
Language: Python · 66 3 28
reppy4620/convnext_tts
Unofficial implementation of ConvNeXt-TTS powered by lightning
Language: Python · 15 5 13
tangYang7/fluency_scorer
An unofficial implementation of a speech fluency assessment model
Language: Python · 9 2 45
CaoYuhang/SpeechAlgorithms
Speech Algorithms
Language: C · 1 0 00
