Pinned Repositories
2021-ISMIR-MSS-Challenge-CWS-PResUNet
Music Source Separation: training, evaluation, and inference pipelines and the pretrained models we used for the 2021 ISMIR MDX Challenge.
adjustable-real-time-style-transfer
AdjustAutocorrelation
Adjusting for Autocorrelated Errors in Neural Networks for Time Series
AFRCNN-For-Speech-Separation
Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network
asteroid
The PyTorch-based audio source separation toolkit for researchers
AttentionAugmentedConvLSTM
Implementation of TAAConvLSTM and SAAConvLSTM used in "Attention Augmented ConvLSTM for Environment Prediction"
AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model working on the raw waveform that runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform, which further improve model performance and its generalization abilities.
FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
nnAudio
Audio processing using PyTorch 1D convolutional networks
newoneincntk's Repositories
newoneincntk/ClearerVoice-Studio
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
newoneincntk/ConformerSE-Net
A Low Computation Cost Model for Real-Time Speech Enhancement
newoneincntk/CRNet
newoneincntk/d2l-zh
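Dive into Deep Learning: written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.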
newoneincntk/Demucs-Gui
A GUI for music separation AI demucs
newoneincntk/DiscreteSpeechMetrics
Reference-aware automatic speech evaluation toolkit
newoneincntk/gtcrn
An official implementation of GTCRN, an ultra-lite speech enhancement model.
newoneincntk/HeyGenClone
A simple and open-source analogue of the HeyGen system
newoneincntk/ICASSP-2023-24-Papers
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
newoneincntk/IIFC-Net
newoneincntk/INTERSPEECH-2023-Papers
INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
newoneincntk/Mamba-SEUNet
This is the official implementation of Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement
newoneincntk/Mamba-UNet
Mamba-UNet Zoo
newoneincntk/MozartsTouch
Official implementation of Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models
newoneincntk/MUSE-Speech-Enhancement
Official code for MUSE: Flexible Voiceprint Receptive Fields and Multi-Path Fusion Enhanced Taylor Transformer for U-Net-based Speech Enhancement
newoneincntk/MyHeyGen
newoneincntk/nnse
NNSE (Neural Network Speech Enhancement) is a speech-denoiser optimized to run on Ambiq's low power platform
newoneincntk/open-universe
Open implementation of UNIVERSE and UNIVERSE++ diffusion-based speech enhancement models.
newoneincntk/resemble-enhance
AI powered speech denoising and enhancement
newoneincntk/RTFS-Net
Official code release for "RTFS-Net: Recurrent time-frequency modelling for efficient audio-visual speech separation", accepted at ICLR 2024
newoneincntk/screenshot-to-code
Drop in a screenshot and convert it to clean HTML/Tailwind/JS code
newoneincntk/se-scaling
Model configurations for scaling SE models in the paper "Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement"
newoneincntk/SE-using-SRL-Model
Causal Speech Enhancement Based on a Two-Branch Nested U-Net Architecture Using Self-Supervised Speech Embeddings
newoneincntk/sheet
Speech Human Evaluation Estimation Toolkit (SHEET)
newoneincntk/Sixty-years-of-frequency-domain-monaural-speech-enhancement
newoneincntk/spiking-fullsubnet
Official repository of Spiking-FullSubNet, the Intel N-DNS Challenge Algorithmic Track Winner.
newoneincntk/SPMamba
newoneincntk/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
newoneincntk/voicefixer
General Speech Restoration
newoneincntk/VPIDM
This is the official repository of a new SOTA diffusion-model-based method for speech enhancement