Labmem-Zhouyx

Focus on TTS/Speech/NLP. El Psy Congroo

Tsinghua University, Shenzhen, Guangdong

Labmem-Zhouyx's Stars

huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
Language: Python, 23.8k stars, 4.9k forks
wenet-e2e/WeTextProcessing
Text Normalization & Inverse Text Normalization
Language:Python39961
k2-fsa/libriheavy
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
Language:Python15310
ga642381/speech-trident
Awesome speech/audio LLMs, representation learning, and codec models
47823
BytedanceSpeech/seed-tts-eval
Language:Python71869
X-LANCE/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
Language:Python38429
DanielLin94144/StyleTalk
Official release of StyleTalk dataset.
512
Tencent/HunyuanDiT
Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Language: Python, 2.6k stars, 176 forks
CNChTu/Diffusion-SVC
Language:Python38057
facebookresearch/AudioMAE
This repo hosts the code and models of "Masked Autoencoders that Listen".
Language:Python50043
OpenNSP/Hifi-vaegan
Language:Python384
X-LANCE/StoryTTS
[ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations
Language:HTML1294
meta-llama/llama3
The official Meta Llama 3 GitHub site
Language: Python, 22.5k stars, 2.3k forks
huggingface/parler-tts
Inference and training library for high-quality TTS models.
Language: Python, 2.8k stars, 289 forks
nachifur/RDDM
CVPR 2024: Residual Denoising Diffusion Models
Language:Python23926
thuhcsi/SECap
Language:Python1019
facebookresearch/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
Language: Jupyter Notebook, 10.5k stars, 1k forks
lonePatient/awesome-pretrained-chinese-nlp-models
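Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models
Language: Python, 4.4k stars, 438 forks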
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language: Python, 4.1k stars, 348 forks
bytedance/SALMONN
SALMONN: Speech Audio Language Music Open Neural Network
Language:Python86760
AMAAI-Lab/mustango
Mustango: Toward Controllable Text-to-Music Generation
Language:Python29725
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
Language:Python99582
keonlee9420/Parallel-Tacotron2
PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
Language:Python18644
fighting41love/zhvoice
Chinese voice corpus with clearer, more natural speech. Includes 8 open-source datasets, 3,200 speakers, 900 hours of speech, and 13 million characters.
524109
daniilrobnikov/vits2
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
Language:Jupyter Notebook41344
neonbjb/DL-Art-School
DLAS - A configuration-driven trainer for generative models
Language:Python129119
152334H/DL-Art-School
TorToiSe fine-tuning with DLAS
Language:Python20584
keonlee9420/Comprehensive-Transformer-TTS
A Non-Autoregressive Transformer-based Text-to-Speech system, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.
Language:Python31541
DmitryRyumin/INTERSPEECH-2023-Papers
INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
60641
declare-lab/adapter-mix
Language:Python134
