Pzhang266
Universal Audio Processing (denoising, source separation, dereverberation ...)
Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing, China
Pinned Repositories
acoustic-scene-analysis-with-multihead-self-attention
This repo contains an implementation of the paper "Acoustic Scene Analysis With Multihead Self Attention" by Weimin Wang, Weiran Wang, Ming Sun, and Chao Wang from the Amazon Alexa team
AEC-Challenge
AEC Challenge
AEC3
AEC3 Extracted From WebRTC
audiosetdl
Scripts for downloading AudioSet
av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles:Latest Advances on Multimodal Large Language Models
Awesome-Speech-Enhancement
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
coder2gwy
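The internet's first guide to the civil service exam for programmers, jointly presented by three former big-tech programmers who have since joined the public sector.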
Optical-Flow-Guided-Feature
Implementation Code of the paper Optical Flow Guided Feature, CVPR 2018
Pzhang266's Repositories
Pzhang266/avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
Pzhang266/Optical-Flow-Guided-Feature
Implementation Code of the paper Optical Flow Guided Feature, CVPR 2018
Pzhang266/acoustic-scene-analysis-with-multihead-self-attention
This repo contains an implementation of the paper "Acoustic Scene Analysis With Multihead Self Attention" by Weimin Wang, Weiran Wang, Ming Sun, and Chao Wang from the Amazon Alexa team
Pzhang266/audiosetdl
Scripts for downloading AudioSet
Pzhang266/av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
Pzhang266/Awesome-Speech-Enhancement
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
Pzhang266/DeepComplexCRN
Pzhang266/dnn_wpe
Pzhang266/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Pzhang266/Independent_Component_Analysis
From scratch Python implementation of the fast ICA algorithm.
Pzhang266/lip-reading-deeplearning
:unlock: Lip Reading - Cross Audio-Visual Recognition using 3D Architectures
Pzhang266/LipNet-PyTorch
The state-of-the-art PyTorch implementation of the method described in the paper "LipNet: End-to-End Sentence-level Lipreading" (https://arxiv.org/abs/1611.01599)
Pzhang266/MTAdam
MTAdam: Automatic Balancing of Multiple Training Loss Terms
Pzhang266/pyflow
Fast, accurate and easy to run dense optical flow with python wrapper
Pzhang266/pytorch-revgrad
A minimal pytorch package implementing a gradient reversal layer.
Pzhang266/sEMG_DeepLearning
sEMG-based gesture recognition using deep learning
Pzhang266/speechbrain
A PyTorch-based Speech Toolkit
Pzhang266/TCDTIMITprocessing
Processing and extraction of face and mouth image files from the TCDTIMIT database
Pzhang266/voicefilter
Unofficial PyTorch implementation of Google AI's VoiceFilter system
Pzhang266/WASE
PyTorch implementation of WASE described in our paper "Wase: Learning When to Attend for Speaker Extraction in Cocktail Party Environments"