Pinned Repositories
AccentedSpeechRecognition
Experiments on speech recognition robustness to accents and dialects
asteroid
The PyTorch-based audio source separation toolkit for researchers
asv-subtools
Open-source tools for speaker recognition
attention_keras
Keras Layer implementation of Attention for Sequential models
AttentionIsOFFByOne
Implementation of "Attention Is Off By One" by Evan Miller
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Audiomer-PyTorch
A Convolutional Transformer for Keyword Spotting
AudioTagger
Deep Learning Neural Networks Final Project
Qifusion-net
The net module of Qifusion-Net: Layer-adapted Stream/Non-stream Model for End-to-End Multi-Accent Speech Recognition
JinmingChe's Repositories
JinmingChe/MTFAA-Net
Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement
JinmingChe/FullSubNet-plus
The official PyTorch implementation of "FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement".
JinmingChe/TaylorSENet
This is the implementation of the paper "Taylor, Can You Hear Me Now? A Taylor-Unfolding Framework for Monaural Speech Enhancement", which was accepted at IJCAI-ECAI 2022 (long oral).
JinmingChe/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector, Language Classifier and Spoken Number Detector
JinmingChe/Uformer
[CVPR 2022] Official repository for the paper "Uformer: A General U-Shaped Transformer for Image Restoration".
JinmingChe/DNN-based-Speech-Enhancement-in-the-frequency-domain
DNN-based SE in the frequency domain using PyTorch. You can test some state-of-the-art networks using T-F masking or spectral mapping methods.
JinmingChe/HiFiplusplus-pytorch
HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement
JinmingChe/DPCRN_DNS3
Implementation of the paper "DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement"
JinmingChe/asteroid
The PyTorch-based audio source separation toolkit for researchers
JinmingChe/sudo_rm_rf
Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features which enables a more efficient way of separating sources from mixtures.
JinmingChe/DeepFilterNet
Noise suppression using deep filtering
JinmingChe/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
JinmingChe/leaf-audio
LEAF is a learnable alternative to audio features such as mel-filterbanks, that can be initialized as an approximation of mel-filterbanks, and then be trained for the task at hand, while using a very small number of parameters.
JinmingChe/PercepNet
(Work In Progress) Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech
JinmingChe/awesome-keyword-spotting
This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection).
JinmingChe/espnet
End-to-End Speech Processing Toolkit
JinmingChe/DeepXi
Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.
JinmingChe/deit
Official DeiT repository
JinmingChe/openspeech
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
JinmingChe/RNNoise_Wrapper
A simple Python wrapper for audio noise reduction RNNoise. Simplifies work with it, adds new trained models and detailed instructions for training.
JinmingChe/voice_activity_detection
Voice Activity Detection based on Deep Learning & TensorFlow
JinmingChe/SE-TFCN
Reproduction of the TFCN speech enhancement paper
JinmingChe/Montreal-Forced-Aligner
Command line utility for forced alignment using Kaldi
JinmingChe/THULAC-Python
An Efficient Lexical Analyzer for Chinese
JinmingChe/asv-subtools
Open-source tools for speaker recognition
JinmingChe/denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model working on the raw waveform that runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it is capable of removing various kinds of background noise, including stationary and non-stationary noises, as well as room reverb. Additionally, we suggest a set of data augmentation techniques applied directly to the raw waveform which further improve model performance and its generalization abilities.
JinmingChe/FullSubNet
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
JinmingChe/SDDNet
Coarse implementation of the paper "A Simultaneous Denoising and Dereverberation Framework with Target Decoupling". On the DNS-2020 dataset, the DNSMOS of the first stage is 3.42 and that of the second stage is 3.47.
JinmingChe/SpecAugment
An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain
JinmingChe/g2p
g2p: English Grapheme To Phoneme Conversion