Labmem-Zhouyx

Focus on TTS/Speech/NLP. El Psy Congroo

Tsinghua University
Shenzhen, Guangdong

Labmem-Zhouyx's Stars

lifeiteng/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
Language: Python, 2k stars, 319 forks
enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
Language: Python, 2.9k stars, 418 forks
lucidrains/audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch
Language: Python, 2.4k stars, 255 forks
roedoejet/FastSpeech2_ACL2022_reproducibility
Language: Python 214
facebookresearch/libri-light
Dataset for lightly supervised training using the LibriVox audiobook recordings. https://librivox.org/
Language: Python 47376
Edresson/YourTTS
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Language: Jupyter Notebook 88077
facebookresearch/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
Language: Python, 3.4k stars, 306 forks
archinetai/audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
Language: Python, 1.9k stars, 167 forks
tts-tutorial/book
601
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Language: Python, 11.6k stars, 2.4k forks
microsoft/GLIP
Grounded Language-Image Pre-training
Language: Python, 2.2k stars, 191 forks
floodsung/Deep-Learning-Papers-Reading-Roadmap
Deep Learning papers reading roadmap for anyone who is eager to learn this amazing tech!
Language: Python, 38k stars, 7.3k forks
zzw922cn/awesome-speech-recognition-speech-synthesis-papers
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
3k stars, 513 forks
jbmouret/matplotlib_for_papers
Handout for the tutorial "Creating publication-quality figures with matplotlib"
Language: Jupyter Notebook, 2.1k stars, 298 forks
CompVis/stable-diffusion
A latent text-to-image diffusion model
Language: Jupyter Notebook, 67.7k stars, 10.1k forks
geekjuruo/ProbExpan
SIGIR 2022: Contrastive Learning with Hard Negative Entities for Entity Set Expansion
Language: Python 302
facebookresearch/mae
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
Language: Python, 7.2k stars, 1.2k forks
cnlinxi/book-text-to-speech
A book about Text-to-Speech (TTS) in Chinese.
Language: TeX 57878
kan-bayashi/LibriTTSLabel
Alignment files of LibriTTS.
587
ivy-llc/ivy
Convert Machine Learning Code Between Frameworks
Language: Python, 14k stars, 5.8k forks
jasminsternkopf/mel_cepstral_distance
Computes the Mel-Cepstral Distance of two WAV files based on the paper "Mel-Cepstral Distance Measure for Objective Speech Quality Assessment" by Robert F. Kubichek.
Language: Python 4610
openai/CLIP
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
Language: Jupyter Notebook, 24.9k stars, 3.2k forks
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Language: Python, 10.4k stars, 667 forks
microsoft/Graphormer
Graphormer is a general-purpose deep learning backbone for molecular modeling.
Language: Python, 2.1k stars, 334 forks
HLTSingapore/Emotional-Speech-Data
This is the GitHub page for publicly available emotional speech data.
31622
neonbjb/tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
Language: Jupyter Notebook, 13k stars, 1.8k forks
NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
Language: Python 84997
afatcoder/LeetcodeTop
A collection of the high-frequency LeetCode problems most often asked by major internet companies 🔥
18.6k stars, 2.7k forks
tuanh123789/AdaSpeech
An implementation of Microsoft's "AdaSpeech: Adaptive Text to Speech for Custom Voice"
Language: Python 9627
TencentGameMate/chinese_speech_pretrain
Chinese speech pretrained models
Language: Shell, 1k stars, 83 forks
