Pinned Repositories
asteroid
The PyTorch-based audio source separation toolkit for researchers || Pretrained models available
athena-signal
audio-visual-speech-enhancement
Official Implementation of "Visual Speech Enhancement", Interspeech 2018.
av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
avobjects
Implementation of the ECCV 2020 paper "Self-Supervised Learning of Audio-Visual Objects from Video"
awesome-multimodal-ml
Reading list for research topics in multimodal machine learning
awesome-speech-recognition-speech-synthesis-papers
Speech synthesis, voice conversion, self-supervised learning, music generation, automatic speech recognition, speaker verification, language modeling
bsseval
audio source separation evaluation metrics
CodingInterviewChinese2
Source code for the second edition of 《剑指Offer》 (Coding Interviews)
ConferencingSpeech2022
Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge in Online Conferencing Applications
wangyang199609's Repositories
wangyang199609/SpEx_Plus
SpEx+(tied) source code
wangyang199609/Speech-measure-SDR-SAR-STOI-PESQ
Speech quality measures (SDR, SAR, STOI, ESTOI, PESQ) implemented in MATLAB
wangyang199609/SpEx
Implementation of "SpEx: Multi-Scale Time Domain Speaker Extraction Network".
wangyang199609/phasen
An unofficial PyTorch implementation of Microsoft's PHASEN
wangyang199609/CodingInterviewChinese2
Source code for the second edition of 《剑指Offer》 (Coding Interviews)
wangyang199609/kaldi
This is the official location of the Kaldi project.
wangyang199609/audio-visual-speech-enhancement
Official Implementation of "Visual Speech Enhancement", Interspeech 2018.
wangyang199609/bsseval
audio source separation evaluation metrics
wangyang199609/awesome-speech-recognition-speech-synthesis-papers
Speech synthesis, voice conversion, self-supervised learning, music generation, automatic speech recognition, speaker verification, language modeling
wangyang199609/MultimodalAnalysis_SpeakerDiarization
This project tackles speaker diarization using audio features, face recognition, and video features extracted from face images and mouth tracking.
wangyang199609/VGGVox
VGGVox models for Speaker Identification and Verification trained on the VoxCeleb (1 & 2) datasets