Pinned Repositories
AEC
AEC-Challenge
AEC Challenge
BSSD
Blind Source Separation and Dereverberation
cdr-dereverb
Coherence-based Dereverberation for Speech Enhancement
Complex-MTASSNet
Multi-Task Audio Source Separation, Two-Stage Model, Complex Domain.
denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and generalization.
DesktopSharing
Desktop sharing with support for RTSP forwarding, RTSP push streaming, and RTMP push streaming.
DFSMN-Based-Lightweight-Speech-Enhancement
Deep Feedforward Sequential Memory Networks (FSMN)
directional-sparse-filtering-tf
Python implementation of Directional Sparse Filtering with TensorFlow/Keras
eusipco2019
The code used for EUSIPCO 2019. The latest version is available in the SoundSourceSeparation repository.
910882575's Repositories
910882575/cdr-dereverb
Coherence-based Dereverberation for Speech Enhancement
910882575/BirdSoundsDenoising
910882575/Causal-U-Net
Unofficial PyTorch implementation of "A Causal U-net based Neural Beamforming Network for Real-Time Multi-Channel Speech Enhancement"
910882575/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
910882575/AQUA-Tk
AQUA-Tk = Audio QUality Assessment-Toolkit. (In development)
910882575/AudioLDM2
Text-to-Audio/Music Generation
910882575/Child-ASR-Paper
A list of papers for child ASR
910882575/DeepFilterNet
Noise suppression using deep filtering
910882575/DSP-Digital-Audio-Processors-in-MATLAB
Digital audio processors such as a compressor/limiter, expander/gate, phase vocoder, multi-tap delay, flanger, reverb, dereverb, and others.
910882575/FilterBanks_FastPythonImplementation
Filter Banks, Fast Python Implementation
910882575/FreeVC
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
910882575/groove2groove
Code for "Groove2Groove: One-Shot Music Style Transfer with Supervision from Synthetic Data"
910882575/HierSpeechpp
The official implementation of HierSpeech++
910882575/LLVC
910882575/MNN
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
910882575/MNNKit
MNNKit is a collection of AI solutions for mobile developers, powered by MNN engine.
910882575/MultichannelAcousticEchoCancellation
910882575/odas
ODAS: Open embeddeD Audition System
910882575/rVAD
Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method.
910882575/SDrecord
910882575/so-vits-svc
SoftVC VITS Singing Voice Conversion
910882575/so-vits-svc-4.0-v2
SoftVC VITS Singing Voice Conversion
910882575/test
910882575/TinyNeuralNetwork
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
910882575/traditional-speech-enhancement
Traditional methods for speech enhancement
910882575/Using-spectral-analysis-and-mapping-to-enhance-the-harmonicity-of-a-sound
Research MATLAB project that analyses inharmonic sounds, estimates each sound's most likely fundamental frequency and harmonic template, and performs spectral mapping to make it sound more harmonic while retaining most of its sound quality.
910882575/versatile_audio_super_resolution
Versatile audio super resolution (any -> 48kHz) with AudioSR.
910882575/visqol
Perceptual Quality Estimator for speech and audio
910882575/voicefixer
General Speech Restoration
910882575/voicefixer2
The second generation of VoiceFixer, a toolkit for general speech restoration. *Not affiliated with the original VoiceFixer repo*