jingxuan9862's Stars
vb000/LookOnceToHear
A novel human-interaction method for real-time speech extraction on headphones.
ewan-xu/pyaec
Simple and efficient Python implementation of a series of adaptive filters for acoustic echo cancellation, including time-domain adaptive filters (LMS, NLMS, RLS, AP, Kalman), nonlinear adaptive filters (Volterra filter, functional link adaptive filters), and frequency-domain adaptive filters (frequency-domain adaptive filter, frequency-domain Kalman filter).
magenta/mt3
MT3: Multi-Task Multitrack Music Transcription
jzi040941/PercepNet
Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech
google/visqol
Perceptual Quality Estimator for speech and audio
webdataset/webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
jingxuan9862/PaddleSpeech
An Easy-to-use Speech Toolkit including SOTA ASR pipeline, influential TTS with text frontend and End-to-End Speech Simultaneous Translation.
PaddlePaddle/PaddleSpeech
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
FederatedAI/FATE
An Industrial Grade Federated Learning Framework
bytedance/byteps
A high performance and generic framework for distributed DNN training
qiuqiangkong/panns_transfer_to_gtzan
jameslyons/python_speech_features
This library provides common speech features for ASR including MFCCs and filterbank energies.
cvondrick/soundnet
SoundNet: Learning Sound Representations from Unlabeled Video. NIPS 2016
qiuqiangkong/audioset_tagging_cnn
deezer/spleeter
Deezer source separation library including pretrained models.
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
abisee/pointer-generator
Code for the ACL 2017 paper "Get To The Point: Summarization with Pointer-Generator Networks"
google/sparrowhawk
wenet-e2e/WeTextProcessing.deprecated
speechio/chinese_text_normalization
Chinese text normalization for speech processing
BUTSpeechFIT/speakerbeam
magenta/ddsp
DDSP: Differentiable Digital Signal Processing
nanahou/Awesome-Speech-Enhancement
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
asteroid-team/asteroid
The PyTorch-based audio source separation toolkit for researchers
nay0648/unified2021
A Unified Speech Enhancement Front-End for Online Dereverberation, Acoustic Echo Cancellation, and Source Separation
etzinis/sudo_rm_rf
Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features, which enables more efficient separation of sources from mixtures.
clovaai/voxceleb_trainer
In defence of metric learning for speaker recognition
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
lenovo-voice/THE-2020-PERSONALIZED-VOICE-TRIGGER-CHALLENGE-BASELINE-SYSTEM