sunxh16

sunxh16's Stars

voidful/Codec-SUPERB
Audio Codec Speech processing Universal PERformance Benchmark
Language:Python20122
ZhangXInFD/SpeechTokenizer
Code for SpeechTokenizer, presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
Language:Python41437
2noise/ChatTTS
A generative speech model for daily dialogue.
Language: Python | Stars: 30.8k | Forks: 3.3k
jishengpeng/Languagecodec
Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models
Language:Python20716
willisma/SiT
Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"
Language:Python59227
bytedance/Make-An-Audio-2
a text-conditional diffusion probabilistic model capable of generating high fidelity audio.
Language:Python11814
baofff/U-ViT
A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".
Language:Jupyter Notebook88958
Text-to-Audio/Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
Language:Python737107
Tele-AI/TeleSpeech-ASR
Language:Python46539
ga642381/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
58226
liutaocode/TTS-arxiv-daily
Automatically update text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)
Language:Python21318
soimort/you-get
⏬ Dumb downloader that scrapes the web
Language: Python | Stars: 50.2k | Forks: 9.4k
metame-ai/awesome-audio-plaza
Daily tracking of awesome audio papers, including music generation, zero-shot tts, asr, audio generation
30611
RetroCirce/MusicLDM
The latent diffusion model for text-to-music generation.
Language:Python1513
Stability-AI/stable-audio-tools
Generative models for conditional audio generation
Language: Python | Stars: 2.5k | Forks: 232
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Language: Python | Stars: 20.6k | Forks: 2.1k
horseee/Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
Language: Python | Stars: 1.1k | Forks: 80
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Language: Python | Stars: 3.2k | Forks: 279
sh-lee-prml/HierSpeechpp
The official implementation of HierSpeech++
Language: Python | Stars: 1.2k | Forks: 136
huggingface/dataspeech
Language:Python27235
lucidrains/naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch
Language: Python | Stars: 1.3k | Forks: 100
scutcsq/Neural-Transducers-for-Two-Stage-Text-to-Speech-via-Semantic-Token-Prediction
Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (arXiv:2401.01498)
Language:Python584
KdaiP/StableTTS
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
Language:Python33536
liguodongiot/llm-action
This project aims to share the technical principles behind large models, along with hands-on practical experience.
Language: HTML | Stars: 9.2k | Forks: 897
lucidrains/spear-tts-pytorch
Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in Pytorch
Language:Python25018
huggingface/distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
Language: Python | Stars: 3.5k | Forks: 275
cpdu/vallt
351
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
Language: Jupyter Notebook | Stars: 35.3k | Forks: 4.2k
PolyAI-LDN/pheme
Language:Python24422
Jackiexiao/tts-frontend-dataset
TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization
Language:Python7915
