haoyz
PhD student @ CASIA; main interests include Speech Separation, Auditory Attention, and Spiking Neural Networks
CASIA, Beijing, China
Pinned Repositories
LISNN
Code for the model presented in the paper "LISNN: Improving Spiking Neural Networks with Lateral Interactions for Robust Object Recognition."
acad-homepage.github.io
AcadHomepage: A Modern and Responsive Academic Personal Homepage
arcface-pytorch
BERT-pytorch
Google AI 2018 BERT pytorch implementation
BP-for-SpikingNN
Spatio-temporal BP for SNNs
Conv-TasNet
sym-STDP-SNN
Code for the model presented in the paper "A Biologically Plausible Supervised Learning Method for Spiking Neural Networks Using the Symmetric STDP Rule"
WASE
Refactoring older code of aispeech-lab/WASE.
haoyz's Repositories
haoyz/sym-STDP-SNN
Code for the model presented in the paper "A Biologically Plausible Supervised Learning Method for Spiking Neural Networks Using the Symmetric STDP Rule"
haoyz/WASE
Refactoring older code of aispeech-lab/WASE.
haoyz/acad-homepage.github.io
AcadHomepage: A Modern and Responsive Academic Personal Homepage
haoyz/Conv-TasNet
haoyz/CSOL_TPAMI2021
haoyz/denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, both stationary and non-stationary, as well as room reverb. Additionally, we suggest a set of data augmentation techniques, applied directly to the raw waveform, that further improve the model's performance and generalization.
haoyz/dual-path-RNNs-DPRNNs-based-speech-separation
A PyTorch implementation of dual-path RNNs (DPRNNs) based speech separation described in "Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation".
haoyz/espnet
End-to-End Speech Processing Toolkit
haoyz/google-research
Google Research
haoyz/Looking-to-Listen-at-the-Cocktail-Party
Executable code based on Google articles
haoyz/neurst
Neural end-to-end Speech Translation Toolkit
haoyz/performer-pytorch
An implementation of Performer, a linear attention-based transformer, in Pytorch
haoyz/PyContrast
PyTorch implementation of Contrastive Learning methods; List of awesome-contrastive-learning papers
haoyz/pytorch-cifar100
Practice on CIFAR-100 (ResNet, DenseNet, VGG, GoogleNet, InceptionV3, InceptionV4, Inception-ResNetv2, Xception, Resnet In Resnet, ResNext, ShuffleNet, ShuffleNetv2, MobileNet, MobileNetv2, SqueezeNet, NasNet, Residual Attention Network, SENet)
haoyz/pytorch-ssd
MobileNetV1, MobileNetV2, VGG based SSD/SSD-lite implementation in Pytorch 1.0 / Pytorch 0.4. Out-of-box support for retraining on Open Images dataset. ONNX and Caffe2 support. Experiment Ideas like CoordConv.
haoyz/quantization-networks-cifar10
A re-implementation of the CVPR19 paper Quantization Networks on CIFAR-10, MNIST and ImageNet
haoyz/rnn-transducer
A Pytorch Implementation of Transducer Model for End-to-End Speech Recognition
haoyz/SincNet
SincNet is a neural architecture for efficiently processing raw audio samples.
haoyz/speaker_extraction
Target speaker extraction and verification for multi-talker speech
haoyz/Speech-Separation-Paper-Tutorial
Must-read papers on speech separation based on neural networks
haoyz/Speech-Transformer
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
haoyz/spleeter
Deezer source separation library including pretrained models.
haoyz/TAC
Transform-average-concatenate (TAC) method for end-to-end microphone permutation and number invariant ad-hoc beamforming.
haoyz/TensorflowASR
haoyz/torchdiffeq
Differentiable ODE solvers with full GPU support and O(1)-memory backpropagation.
haoyz/transformers
🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
haoyz/uis-rnn
This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.
haoyz/Video-LLaMA
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
haoyz/video2frame
Yet another easy-to-use tool to extract frames from videos, for deep learning and computer vision.
haoyz/voicefilter
Unofficial PyTorch implementation of Google AI's VoiceFilter system