Pinned Repositories
2.5D-Visual-Sound
2.5D visual sound
AAS_enhancement
This repository contains the code and supplementary results for the paper "Unpaired Speech Enhancement by Acoustic and Adversarial Supervision".
Adaptive-MultiSpeaker-Separation
Adaptive and Focusing Neural Layers for Multi-Speaker Separation Problem
Advanced-Deep-Learning-with-Keras
Advanced Deep Learning with Keras, published by Packt
adversarial-robustness-toolbox
This is a library dedicated to adversarial machine learning. Its purpose is to allow rapid crafting and analysis of attack and defense methods for machine learning models. The Adversarial Robustness Toolbox provides implementations of many state-of-the-art methods for attacking and defending classifiers. https://developer.ibm.com/code/open/projects/adversarial-robustness-toolbox/
cgan_speechenhancement
A fully convolutional end-to-end speech enhancement system using GANs
manyears
ManyEars Sound Source Localization, Tracking and Separation
nn-irm
A Simple DNN-IRM estimator for speech enhancement
noise-reduction-using-rnn
Python programs to train and test a recurrent neural network with TensorFlow
vadnet
Real-time Voice Activity Detection in Noisy Environments using Deep Neural Networks
zhaoforever's Repositories
zhaoforever/athena-signal
zhaoforever/bitsandbytes
Library for 8-bit optimizers and quantization routines.
zhaoforever/bssaec2020
A New Perspective of Auxiliary-Function-Based Independent Component Analysis in Acoustic Echo Cancellation
zhaoforever/DTLN-aec
This repository contains the pretrained DTLN-aec model for real-time acoustic echo cancellation.
zhaoforever/EasyComDataset
The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmented-reality (AR)-motivated, multi-sensor egocentric world view.
zhaoforever/flops-counter.pytorch
FLOPs counter for convolutional networks in the PyTorch framework
zhaoforever/HRTF-construction
Code for HRTF database construction
zhaoforever/leaf-audio
zhaoforever/model-compression
Model compression based on PyTorch: (1) quantization: 16/8/4/2-bit (DoReFa; "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and ternary/binary weights (TWN/BNN/XNOR-Net); (2) pruning: normal, regular, and group-convolution channel pruning; (3) group convolution structure; (4) batch-normalization folding for quantization.
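Batch-normalization folding, one of the techniques listed above, merges a BatchNorm layer into the preceding convolution so that quantization sees a single fused layer. Below is a minimal PyTorch sketch of the idea; it is not taken from this repository, and the function name is illustrative.

```python
# Minimal sketch (not from this repo) of batch-normalization folding:
# fold BN statistics into the preceding conv's weights and bias.
import torch
import torch.nn as nn

def fold_bn_into_conv(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Return a new Conv2d whose output equals bn(conv(x)) in eval mode."""
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      stride=conv.stride, padding=conv.padding, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)   # gamma / sigma
    fused.weight.data = conv.weight * scale.reshape(-1, 1, 1, 1)
    conv_bias = conv.bias if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.data = (conv_bias - bn.running_mean) * scale + bn.bias
    return fused

# Quick check: the fused layer matches conv followed by BN in eval mode.
conv, bn = nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8)
conv.eval(); bn.eval()
x = torch.randn(1, 3, 16, 16)
print(torch.allclose(bn(conv(x)), fold_bn_into_conv(conv, bn)(x), atol=1e-5))
```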
zhaoforever/music_source_separation
zhaoforever/nnom
A higher-level Neural Network library for microcontrollers.
zhaoforever/openMHA
The open Master Hearing Aid (openMHA)
zhaoforever/paderwasn
Paderwasn is a collection of methods for acoustic signal processing in wireless acoustic sensor networks (WASNs).
zhaoforever/pedalboard
A Python library for adding effects to audio.
zhaoforever/PercepNet
(Work In Progress) Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech
zhaoforever/PseudoBinaural_CVPR2021
Codebase for the paper "Visually Informed Binaural Audio Generation without Binaural Audios" (CVPR 2021)
zhaoforever/python-pesq-1
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)
zhaoforever/RAdam
On the Variance of the Adaptive Learning Rate and Beyond
zhaoforever/RIR-Generator
Generating room impulse responses
zhaoforever/room-impulse-responses
A list of publicly available room impulse response datasets and scripts to download them.
zhaoforever/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit.
zhaoforever/SepStereo_ECCV2020
Codebase for the paper "Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation" (ECCV2020)
zhaoforever/sofamyroom
Room acoustic simulator with a SOFA file loader.
zhaoforever/speechbrain
A PyTorch-based Speech Toolkit
zhaoforever/speechmetrics
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
zhaoforever/Subband-Music-Separation
PyTorch: Channel-wise subband input for better voice and accompaniment separation
zhaoforever/svoice
A PyTorch implementation of the paper "Voice Separation with an Unknown Number of Multiple Speakers," which presents a new method for separating a mixed audio sequence in which multiple voices speak simultaneously. The method employs gated neural networks trained to separate the voices over multiple processing steps while keeping the speaker in each output channel fixed. A separate model is trained for each possible number of speakers, and the model with the largest number of speakers is used to select the actual number of speakers in a given sample. The method greatly outperforms the previous state of the art, which, as the authors show, is not competitive for more than two speakers.
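The selection step described above (running the model trained for the largest number of speakers, then deciding how many speakers are actually present) can be illustrated with a small sketch. This is not the svoice codebase; the function and energy threshold below are hypothetical stand-ins for that decision logic.

```python
# Hypothetical sketch of the speaker-count selection idea: run the largest
# model, then count output channels whose energy exceeds a relative threshold.
import torch

def estimate_num_speakers(separated: torch.Tensor, rel_threshold: float = 0.01) -> int:
    """separated: (num_channels, num_samples) outputs of the largest model."""
    energies = separated.pow(2).mean(dim=-1)            # per-channel mean energy
    active = energies > rel_threshold * energies.max()  # channels above threshold
    return int(active.sum().item())

# Usage with dummy data: a 5-channel output where only 2 channels carry signal.
outputs = torch.zeros(5, 16000)
outputs[0] = torch.randn(16000)
outputs[1] = 0.5 * torch.randn(16000)
print(estimate_num_speakers(outputs))  # -> 2
```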
zhaoforever/Target-sound-event-detection
zhaoforever/unified2021
A Unified Speech Enhancement Front-End for Online Dereverberation, Acoustic Echo Cancellation, and Source Separation
zhaoforever/voicefixer
General Speech Restoration