ZhihaoDU

Senior Engineer at Alibaba Group

Alibaba Group, China

Pinned Repositories

CosyVoice
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Language:Python6.4k 64 528692
FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Language:Python7k 65 1.2k752
FunCodec
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.
Language:Python370 15 5230
attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
Language:Python11
du2020dan
The implementation of our paper "Double adversarial networks for monaural speech enhancement" accepted by INTERSPEECH 2020.
Language:Python7 2 00
du2020kws
This is a small-footprint, robust KWS system based on multi-conditional training, retraining, and joint training. It includes a small-footprint KWS model and a small-footprint speech enhancement model. We also investigate a compression method for CNNs and LSTMs.
Language:Python2 2 01
du2022sond
Speaker overlap-aware Neural Diarization
100
food_is_unstopped
Food is unstopped!!!! GO!
Language:Python3 2 00
speech_feature_extractor
Some useful features for speech processing, such as MFCC, gammatone filterbank, GFCC, spectrum (power spectrum and log-power spectrum), Amplitude Modulation Spectrum (AMS), and so on.
Language:Python121 2 541
zhihaodu.github.io
Language:HTML20

ZhihaoDU's Repositories

ZhihaoDU/speech_feature_extractor
Some useful features for speech processing, such as MFCC, gammatone filterbank, GFCC, spectrum (power spectrum and log-power spectrum), Amplitude Modulation Spectrum (AMS), and so on.
Language:Python121 2 541
ZhihaoDU/du2022sond
Speaker overlap-aware Neural Diarization
100
ZhihaoDU/du2020dan
The implementation of our paper "Double adversarial networks for monaural speech enhancement" accepted by INTERSPEECH 2020.
Language:Python7 2 00
ZhihaoDU/food_is_unstopped
Food is unstopped!!!! GO!
Language:Python3 2 00
ZhihaoDU/du2020kws
This is a small-footprint, robust KWS system based on multi-conditional training, retraining, and joint training. It includes a small-footprint KWS model and a small-footprint speech enhancement model. We also investigate a compression method for CNNs and LSTMs.
Language:Python2 2 01
ZhihaoDU/zhihaodu.github.io
Language:HTML20
ZhihaoDU/attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
Language:Python11
ZhihaoDU/improved-gan
code for the paper "Improved Techniques for Training GANs"
Language:Python1 1 00
ZhihaoDU/asteroid
The PyTorch-based audio source separation toolkit for researchers || Pretrained models available
Language:Python00
ZhihaoDU/awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
01
ZhihaoDU/compare_gan
Compare GAN code.
Language:Python00
ZhihaoDU/DDAEC
Language:Python0 1 00
ZhihaoDU/demo_train
ZhihaoDU/espnet
End-to-End Speech Processing Toolkit
Language:Python1 0
ZhihaoDU/FeatureEmbedding
Feature Embedding
ZhihaoDU/FloWaveNet
A PyTorch implementation of "FloWaveNet: A Generative Flow for Raw Audio"
Language:Python1 0
ZhihaoDU/griffin_lim
Implementation of the Griffin and Lim algorithm to recover an audio signal from a magnitude-only spectrogram.
Language:Python1 0
ZhihaoDU/huxpro.github.io
My Blog / Jekyll Themes / PWA
Language:CSS
ZhihaoDU/kaldi_feat_enh
enhancement model for kaldi features
Language:Python
ZhihaoDU/neos_speech_utils
Speech utilities that may be useful to researchers in speech separation, speech enhancement, and speech synthesis. Enjoy!
Language:Python
ZhihaoDU/progressive_growing_of_gans
Progressive Growing of GANs for Improved Quality, Stability, and Variation
Language:Python
ZhihaoDU/pytorch-CycleGAN-and-pix2pix
Image-to-image translation in PyTorch (e.g., horse2zebra, edges2cats, and more)
Language:Python
ZhihaoDU/pytorch-spectral-normalization-gan
Paper by Miyato et al. https://openreview.net/forum?id=B1QRgziT-
Language:Python
ZhihaoDU/stylegan
StyleGAN - Official TensorFlow Implementation
Language:Python1 0
ZhihaoDU/tf-kaldi-speaker
Neural speaker recognition/verification system based on Kaldi and Tensorflow
Language:Python0 0
ZhihaoDU/torch-two-sample
A PyTorch library for two-sample tests
Language:Jupyter Notebook
