linan2's Stars
CMU-Perceptual-Computing-Lab/openpose
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
KindXiaoming/pykan
Kolmogorov Arnold Networks
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
modelscope/modelscope
ModelScope: bring the notion of Model-as-a-Service to life.
bytedance/SALMONN
SALMONN: Speech Audio Language Music Open Neural Network
KinWaiCheuk/nnAudio
Audio processing using PyTorch 1D convolutional networks
wenet-e2e/wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
X-LANCE/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
gabrielmittag/NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
a171232886/TJUThesis_master_2021
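Tianjin University doctoral/master's thesis LaTeX template, revised to meet the 2021 requirements; runs directly on Overleaf. :star: A thesis written with it was successfully submitted to the Tianjin University Library for archiving! (2021.12.24)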
Xiaobin-Rong/gtcrn
The official implementation of GTCRN, an ultra-lightweight SE model.
OpenT2S/LlamaVoice
LlamaVoice is a llama-based large voice generation model, providing inference and training ability.
audiolabs/torch-pesq
PyTorch implementation of the Perceptual Evaluation of Speech Quality for wideband audio
xiongyihui/tdoa
TDOA based on GCC-PHAT
qhduan/cn-chat-arxiv
wenet-e2e/wesep
Target Speaker Extraction Toolkit
nii-yamagishilab/ZMM-TTS
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
caopulan/iKUNet
cszheng-ioa/Sixty-years-of-frequency-domain-monaural-speech-enhancement
haidog-yaqub/DiffPitcher
Diffusion-based singing voice pitch correction
sungwon23/BSRNN
urgent-challenge/urgent2024_challenge
Official data preparation scripts for the URGENT 2024 Challenge
BingYang-20/SRP-DNN
A Python implementation of “SRP-DNN: Learning Direct-Path Phase Difference for Multiple Moving Sound Source Localization” [ICASSP 2022]
caoruitju/RUI_SE
VOICOR: A Residual Iterative Voice Correction Framework for Monaural Speech Enhancement
fchest/DBPNet
DBPNet model
felixperfler/Stable-Hybrid-Auditory-Filterbanks
[Interspeech 2024] Hold Me Tight: Stable Encoder-Decoder Design for Speech Enhancement
SLPcourse/Singing-Voice-Conversion
Project of Singing Voice Conversion.
mrjunjieli/ActiveExtract
linan2/tensorflow-1.4.0
TensorFlow 1.4.0 installed version.
niyah2/Shirley