zexupan
Algorithm engineer @ AlibabaGroup; Visiting research scientist @ MERL; PhD @ NUS. Working on speech extraction and multimedia.
National University of Singapore, Singapore
zexupan's Stars
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
mozilla/TTS
:robot: :speech_balloon: Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
espnet/espnet
End-to-End Speech Processing Toolkit
lixin4ever/Conference-Acceptance-Rate
Acceptance rates for the major AI conferences
52CV/CVPR-2021-Papers
ming024/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
kan-bayashi/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch
YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
yzhuoning/Awesome-CLIP
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
xcmyz/FastSpeech
An implementation of FastSpeech based on PyTorch.
krantiparida/awesome-audio-visual
A curated list of different papers and datasets in various areas of audio-visual processing
fgnt/nara_wpe
Different implementations of "Weighted Prediction Error" for speech dereverberation
jefflai108/Contrastive-Predictive-Coding-PyTorch
Contrastive Predictive Coding for Automatic Speaker Verification
TaoRuijie/TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
mpariente/pystoi
Python implementation of the Short Term Objective Intelligibility measure
kkoutini/PaSST
Efficient Training of Audio Transformers with Patchout
vb000/Waveformer
A deep neural network architecture for low-latency audio processing
nryant/dscore
Diarization scoring tools.
xuchenglin28/speaker_extraction
Target speaker extraction and verification for multi-talker speech
youngwoo-yoon/youtube-gesture-dataset
This repository contains scripts to build Youtube Gesture Dataset.
merlresearch/cocktail-fork-separation
Baseline multi-resolution cross network model trained using the Divide and Remaster Dataset
zcxu-eric/AVA-AVD
zexupan/MuSE
Jiang-Yidi/FlatTrajectoryDistillation_FTD
The code of the paper "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation" (CVPR2023)
zexupan/avse_hybrid_loss
zexupan/reentry
zexupan/USEV
zexupan/ImagineNET
zexupan/seg
biji0002/EE4208ComputerVision
Face Detection