mwang-lifesize

Pinned Repositories

acoustic-simulator
Implementation of audio degradation processes
Language:Python
awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
awesome-kaldi
This is a list of features, scripts, blogs and resources for better using Kaldi ( http://kaldi-asr.org/ )
Background-Matting
Background Matting: The World is Your Green Screen
Language:Python
cppcoro
A library of C++ coroutine abstractions for the coroutines TS
Language:C++
deepcorrect
Text and Punctuation correction with Deep Learning
Language:Python
deepsegment
A sentence segmenter that actually works!
Language:Python
DeepSpeech
A TensorFlow implementation of Baidu's DeepSpeech architecture
Language:C++
dejavu
Audio fingerprinting and recognition in Python
Language:Python
Facial-Similarity-with-Siamese-Networks-in-Pytorch
Implementing Siamese networks with a contrastive loss for similarity learning
Language:Jupyter Notebook

mwang-lifesize's Repositories

mwang-lifesize/awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
mwang-lifesize/awesome-kaldi
This is a list of features, scripts, blogs and resources for better using Kaldi ( http://kaldi-asr.org/ )
mwang-lifesize/Background-Matting
Background Matting: The World is Your Green Screen
Language:Python
mwang-lifesize/cppcoro
A library of C++ coroutine abstractions for the coroutines TS
Language:C++
mwang-lifesize/deepcorrect
Text and Punctuation correction with Deep Learning
Language:Python
mwang-lifesize/deepsegment
A sentence segmenter that actually works!
Language:Python
mwang-lifesize/DeepSpeech
A TensorFlow implementation of Baidu's DeepSpeech architecture
Language:C++
mwang-lifesize/dejavu
Audio fingerprinting and recognition in Python
Language:Python
mwang-lifesize/Facial-Similarity-with-Siamese-Networks-in-Pytorch
Implementing Siamese networks with a contrastive loss for similarity learning
Language:Jupyter Notebook
mwang-lifesize/frugally-deep
Header-only library for using Keras models in C++.
Language:C++
mwang-lifesize/Generic-Speaker-Verificator
Language:Python
mwang-lifesize/ivector-xvector
Extract xvector and ivector under kaldi
Language:Shell
mwang-lifesize/keras-sincnet
Keras (tensorflow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet)
Language:Python
mwang-lifesize/py-kaldi-asr
Some simple wrappers around kaldi-asr intended to make using kaldi's (online) decoders as convenient as possible.
Language:C++
mwang-lifesize/PyTorch_Speaker_Verification
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
Language:Python
mwang-lifesize/raw-audio-gender-classification
Machine learning experiment to perform gender classification from raw audio.
Language:Python
mwang-lifesize/rnnoise-wasm
rnnoise noise suppression library as a WASM module
mwang-lifesize/simple_bodypix_python
A simple and minimal bodypix inference in python
mwang-lifesize/SincNet
SincNet is a neural architecture for efficiently processing raw audio samples.
Language:Python
mwang-lifesize/Speaker-Identification
A program for automatic speaker identification using deep learning techniques.
Language:Python
mwang-lifesize/Speaker-Identification-Python
Speaker Identification System (up to 100% accuracy); built using Python 2.7 and the python_speech_features library
Language:Python
mwang-lifesize/speaker-recognition-3d-cnn
Keras + PyTorch implementation of "Deep Learning & 3D Convolutional Neural Networks for Speaker Verification"
Language:Python
mwang-lifesize/speaker-recognition-py3
Based on MFCC and GMM (speech recognition based on MFCC and Gaussian mixture models)
Language:Python
mwang-lifesize/speakerIdentificationNeuralNetworks
⇨ The Speaker Recognition System consists of two phases, Feature Extraction and Recognition. ⇨ In the Extraction phase, the speaker's voice is recorded and a typical number of features is extracted to form a model. ⇨ During the Recognition phase, a speech sample is compared against a previously created voice print stored in the database. ⇨ The highlight of the system is that it can also identify the speaker's voice in a multi-speaker environment. A Multi-layer Perceptron (MLP) neural network based on the error back-propagation training algorithm was used to train and test the system. ⇨ The system response time was 74 µs with an average efficiency of 95%.
Language:MATLAB
mwang-lifesize/SpeakerRecognition_tutorial
Simple d-vector based Speaker Recognition using Pytorch
Language:Python
mwang-lifesize/timit
The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus
mwang-lifesize/VGGVox
VGGVox models for Speaker Identification and Verification trained on the VoxCeleb (1 & 2) datasets
Language:MATLAB
mwang-lifesize/voicemap
Identifying people from small audio fragments
Language:Python
mwang-lifesize/wer_are_we
Attempt at tracking the state of the art and recent results (bibliography) on speech recognition.
mwang-lifesize/zamia-speech
Open tools and data for cloudless automatic speech recognition
Language:Python