ctwgL

Student, interested in speech processing

Shanghai University
Wuhan, China

Pinned Repositories

acoustic-interference-cancellation
Acoustic interference (echo) cancellation project from a summer internship
Language: MATLAB 00
AI-Expert-Roadmap
Roadmap to becoming an Artificial Intelligence Expert in 2021
Language: JavaScript 0 0 00
annotated_deep_learning_paper_implementations
🧑‍🏫 50! Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
Language: Jupyter Notebook 0 0 00
ant-design
An enterprise-class UI design language and React UI library
Language: TypeScript 00
ASC_baseline
Language: Python 0 1 00
athena-signal
Language: C 00
audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
Language: Python 0 1 00
VAD
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Language: MATLAB 10
webrtc-beamforming
WebRTC beamforming module, extracted and organized
Language: C++ 25 1 516
webrtc_agc2
demo for webrtc agc2
Language: Makefile 28 3 211

ctwgL's Repositories

ctwgL/voice-filter
An unofficial PyTorch implementation of Google's VoiceFilter
ctwgL/CMake-Cookbook
:book: A Chinese translation of the "CMake Cookbook".
ctwgL/CNBF
Complex Neural Beamformer
ctwgL/gpuRIR
Python library for Room Impulse Response (RIR) simulation with GPU acceleration
ctwgL/Ne10
An open optimized software library project for the ARM® Architecture
ctwgL/OMLSA-MCRA
C++ speech enhancement based on OMLSA-MCRA
ctwgL/speechmetrics
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
ctwgL/Speech_Signal_Processing_and_Classification
Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification, that is, in developing two-class classifiers that can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of the speech production system in humans suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) constitute a first choice for modeling the magnitude of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the system (e.g., vocal tract) contribution and that of the excitation. Taking into account the characteristics of the human ear, the mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, the perceptual linear prediction coefficients (PLPs) could also be derived. These traditional features, so to speak, will be tested against agnostic features extracted by convolutional neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian Mixture Model based classifiers, K-nearest neighbor classifiers, Bayes classifiers, as well as Deep Neural Networks. The Massachusetts Eye and Ear Infirmary Dataset (MEEI-Dataset) [5] will be exploited. At the application level, a library for feature extraction and classification in Python will be developed. Credible publicly available resources, such as KALDI, will be used toward achieving our goal. Comparisons will be made against [6-8].
ctwgL/VAD
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
ctwgL/CPlusPlusThings
All things C++
ctwgL/distiller
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://nervanasystems.github.io/distiller
ctwgL/musco-pytorch
MUSCO: MUlti-Stage COmpression of neural networks
ctwgL/model-compression-and-acceleration-progress
Repository to track the progress in model compression and acceleration
ctwgL/audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
ctwgL/Network-Speed-and-Compression
Network acceleration methods
ctwgL/DCUnet.pytorch
Phase-Aware Speech Enhancement with Deep Complex U-Net
ctwgL/Awesome-pytorch-list
A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
ctwgL/Model-Compression-Acceleration
Paper list on model compression and acceleration
ctwgL/Awesome-model-compression-and-acceleration
ctwgL/Nonlinear-System-Identification-with-Wavelet-Discrete-Transform
Nonlinear System Identification with Wavelet Discrete Transform
ctwgL/Voice_Activity_Detector
A statistical model-based Voice Activity Detection
ctwgL/Deep-Compression-PyTorch
PyTorch implementation of 'Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding' by Song Han, Huizi Mao, William J. Dally
ctwgL/Pytorch-Quaternion-Neural-Networks
This repository is an update to all previous repositories with an implementation of various Quaternion-valued Neural Networks in PyTorch
ctwgL/rnnoise-models
Trained neural networks and requisite information and data for rnnoise-nu
ctwgL/acoustic-interference-cancellation
Acoustic interference (echo) cancellation project from a summer internship
ctwgL/pytorch-tensor-decompositions
PyTorch implementation of [1412.6553] and [1511.06530] tensor decomposition methods for convolutional layers.
ctwgL/pytorch-weight-prune
PyTorch version of weight pruning for Murata Group's CREST project
ctwgL/deepbeam
Deep learning based Speech Beamforming
ctwgL/sndfilter
Algorithms for sound filters, like reverb, dynamic range compression, lowpass, highpass, notch, etc
ctwgL/label_smoothing
Corrupted labels and label smoothing