WenzheLiu-Speech

Hi, I am Wenzhe Liu. I work at Kuaishou and was previously employed by Tencent, focusing on generalized speech enhancement, audio codecs, and speech synthesis.

Tencent, Beijing, China

Pinned Repositories

ADSP_Tutorials
Advanced Signal Processing Notebooks and Tutorials
Language: Jupyter Notebook 4 2 02
ai-audio-datasets
AI Audio Datasets 🎵. A list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
50
awesome-speech-enhancement
speech enhancement / speech separation / sound source localization
1.1k 42 1224
ChatGLMVoice-SFT
Language: Python 30
penguins-aicodec-demo
6 2 00
pyaec
A simple and efficient Python implementation of a series of adaptive filters (LMS, NLMS, RLS, Kalman, Frequency Domain Adaptive Filter, Partitioned-Block-Based Frequency Domain Adaptive Filter, Frequency Domain Kalman Filter, Partitioned-Block-Based Frequency Domain Kalman Filter) for acoustic echo cancellation.
Language: Python 3 1 01
Realtime_AudioDenoise_EchoCancellation
Language: C++ 5 1 00
sound-source-localization-algorithm_DOA_estimation
Some traditional algorithms for sound source localization (DOA estimation) of speech signals
Language: MATLAB 396 6 784
The-guidebook-of-speech-enhancement
111 3 06
wenzheliu-speech
3 2 01

WenzheLiu-Speech's Repositories

WenzheLiu-Speech/awesome-speech-enhancement
speech enhancement / speech separation / sound source localization
1.1k 42 1224
WenzheLiu-Speech/The-guidebook-of-speech-enhancement
111 3 06
WenzheLiu-Speech/penguins-aicodec-demo
6 2 00
WenzheLiu-Speech/ai-audio-datasets
AI Audio Datasets 🎵. A list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
50
WenzheLiu-Speech/ChatGLMVoice-SFT
Language: Python 30
WenzheLiu-Speech/wenzheliu-speech
3 2 01
WenzheLiu-Speech/aac-datasets
Audio Captioning datasets for PyTorch.
Language: Python 2 1 00
WenzheLiu-Speech/awesome-large-audio-models
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
2 1 00
WenzheLiu-Speech/kmeans_pytorch
kmeans using PyTorch
2
WenzheLiu-Speech/MP-SENet
MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra
Language: Python 2 1 00
WenzheLiu-Speech/speech-synthesis-paper
List of speech synthesis papers.
1
WenzheLiu-Speech/WenzheLiu-Speech.github.io
Language: HTML 1 1 01
WenzheLiu-Speech/awesome-voice-conversion
A curated list of awesome voice conversion projects and communities.
WenzheLiu-Speech/cutword
A simple and fast tool for word segmentation and named entity recognition
Language: Python 1 0
WenzheLiu-Speech/EasyRec
A framework for large scale recommendation algorithms.
Language: Python 1 0
WenzheLiu-Speech/effective_llm_alignment
Effective LLM Alignment Toolkit
WenzheLiu-Speech/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
Language: Python 1 0
WenzheLiu-Speech/gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
WenzheLiu-Speech/MaskGCT-Training
Training code for MaskGCT-T2S model.
WenzheLiu-Speech/McNet
The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023
Language: Python 1 0
WenzheLiu-Speech/minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Language: Python 1 0
WenzheLiu-Speech/nano-llama31
nanoGPT style version of Llama 3.1
WenzheLiu-Speech/OpenVoice
Instant voice cloning by MyShell.
WenzheLiu-Speech/SE-IFD
WenzheLiu-Speech/SoundStorm
The reproduced code for Google's SoundStorm
WenzheLiu-Speech/the-algorithm
Source code for Twitter's Recommendation Algorithm
WenzheLiu-Speech/tts-frontend-dataset
TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization
WenzheLiu-Speech/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
WenzheLiu-Speech/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
WenzheLiu-Speech/XPhoneBERT
XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)