zexupan

Algorithm engineer @ AlibabaGroup; Visiting research scientist @ MERL; PhD @ NUS. Working on speech extraction and multimedia.

National University of Singapore, Singapore

zexupan's Stars

coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Language: Python 36.4k 298 1.1k 4.5k
mozilla/TTS
:robot: :speech_balloon: Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Language: Jupyter Notebook 9.5k 185 566 1.3k
espnet/espnet
End-to-End Speech Processing Toolkit
Language: Python 8.6k 177 2.4k 2.2k
lixin4ever/Conference-Acceptance-Rate
Acceptance rates for the major AI conferences
Language: Jupyter Notebook 4.3k 131 29305
52CV/CVPR-2021-Papers
2.5k 66 21314
ming024/FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Language: Python 1.9k 28 220544
kan-bayashi/ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch
Language: Jupyter Notebook 1.6k 47 256343
YuanGongND/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
Language: Jupyter Notebook 1.2k 17 137219
yzhuoning/Awesome-CLIP
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
1.2k 19 1557
xcmyz/FastSpeech
The Implementation of FastSpeech based on pytorch.
Language: Python 861 34 97213
krantiparida/awesome-audio-visual
A curated list of different papers and datasets in various areas of audio-visual processing
681 18 267
fgnt/nara_wpe
Different implementations of "Weighted Prediction Error" for speech dereverberation
Language: Python 496 20 37164
jefflai108/Contrastive-Predictive-Coding-PyTorch
Contrastive Predictive Coding for Automatic Speaker Verification
Language: Python 483 4 21100
TaoRuijie/TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
Language: Python 333 8 6978
mpariente/pystoi
Python implementation of the Short Term Objective Intelligibility measure
Language: MATLAB 329 13 1959
kkoutini/PaSST
Efficient Training of Audio Transformers with Patchout
Language: Python 312 5 4651
vb000/Waveformer
A deep neural network architecture for low-latency audio processing
Language: Python 291 6 534
nryant/dscore
Diarization scoring tools.
Language: Python 228 8 443
xuchenglin28/speaker_extraction
target speaker extraction and verification for multi-talker speech
Language: Python 167 8 529
youngwoo-yoon/youtube-gesture-dataset
This repository contains scripts to build Youtube Gesture Dataset.
Language: Python 120 4 918
merlresearch/cocktail-fork-separation
Baseline multi-resolution cross network model trained using the Divide and Remaster Dataset
Language: Python 77 4 212
zcxu-eric/AVA-AVD
Language: Python 45 2 63
zexupan/MuSE
Language: Python 32 1 64
Jiang-Yidi/FlatTrajectoryDistillation_FTD
The code of the paper "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation" (CVPR2023)
Language: Python 18 0 0 1
zexupan/avse_hybrid_loss
Language: Python 15 1 0 1
zexupan/reentry
Language: Python 14 1 24
zexupan/USEV
Language: Python 13 3 20
zexupan/ImagineNET
Language: Python 4 1 10
zexupan/seg
Language: Python 4 1 0 0
biji0002/EE4208ComputerVision
Face Detection
Language: Python 2 0 0 3
