wxue-audio's Stars
fishaudio/fish-speech
SOTA Open Source TTS
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
hhguo/EA-SVC
An implementation of "Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training"
mathigatti/DeepSingingSynthesizer
Extension of Sinsy-NG using deep learning models for voice conversion in order to synthesize good and realistic vocals.
jim-schwoebel/voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
Samsung/ONE
On-device Neural Engine
pyannote/pyannote-metrics
A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems
cpuimage/rnnoise
Recurrent neural network for audio noise reduction
xiph/rnnoise
Recurrent neural network for audio noise reduction
jzi040941/PercepNet
Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech
facebookresearch/denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques, applied directly to the raw waveform, that further improve the model's performance and generalization.
vimalmanohar/kaldi
Fork of the official kaldi.
asteroid-team/asteroid
The PyTorch-based audio source separation toolkit for researchers
schmiph2/pysepm
Python implementation of performance metrics in Loizou's Speech Enhancement book
kaituoxu/Conv-TasNet
A PyTorch implementation of Conv-TasNet described in "TasNet: Surpassing Ideal Time-Frequency Masking for Speech Separation" with Permutation Invariant Training (PIT).
naplab/Conv-TasNet
wangkenpu/Conv-TasNet-PyTorch
A PyTorch implementation of Conv-TasNet
WenzheLiu-Speech/awesome-speech-enhancement
Speech enhancement / speech separation / sound source localization
microsoft/DNS-Challenge
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
JoergFranke/phoneme_recognition
Phoneme Recognition using RecNet
hyperconnect/TC-ResNet
Code for Temporal Convolution for Real-time Keyword Spotting on Mobile Devices
ododoyo/EHNet
A neural network consisting of a CNN and an LSTM for speech enhancement
vbelz/Speech-enhancement
Deep learning for audio denoising
haoxiangsnr/Wave-U-Net-for-Speech-Enhancement
A PyTorch implementation of Wave-U-Net, adapted for speech enhancement.
eesungkim/Speech_Enhancement_DNN_NMF
Speech Enhancement based on DNN (Spectral-Mapping, TF-Masking), DNN-NMF, NMF
Tony607/Keras_Deep_Clustering
How to do Unsupervised Clustering with Keras
Tony607/Keras-Trigger-Word
How to do Real Time Trigger Word Detection with Keras | DLology
kaituoxu/TasNet
A PyTorch implementation of Time-domain Audio Separation Network (TasNet) with Permutation Invariant Training (PIT) for speech separation.
kaituoxu/Speech-Transformer
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.