Pinned Repositories
acoustic-simulator
Implementation of audio degradation processes
awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
awesome-kaldi
This is a list of features, scripts, blogs and resources for better using Kaldi (http://kaldi-asr.org/)
Background-Matting
Background Matting: The World is Your Green Screen
cppcoro
A library of C++ coroutine abstractions for the coroutines TS
deepcorrect
Text and Punctuation correction with Deep Learning
deepsegment
A sentence segmenter that actually works!
DeepSpeech
A TensorFlow implementation of Baidu's DeepSpeech architecture
dejavu
Audio fingerprinting and recognition in Python
Facial-Similarity-with-Siamese-Networks-in-Pytorch
Implementing Siamese networks with a contrastive loss for similarity learning
mwang-lifesize's Repositories
mwang-lifesize/awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
mwang-lifesize/awesome-kaldi
This is a list of features, scripts, blogs and resources for better using Kaldi (http://kaldi-asr.org/)
mwang-lifesize/Background-Matting
Background Matting: The World is Your Green Screen
mwang-lifesize/cppcoro
A library of C++ coroutine abstractions for the coroutines TS
mwang-lifesize/deepcorrect
Text and Punctuation correction with Deep Learning
mwang-lifesize/deepsegment
A sentence segmenter that actually works!
mwang-lifesize/DeepSpeech
A TensorFlow implementation of Baidu's DeepSpeech architecture
mwang-lifesize/dejavu
Audio fingerprinting and recognition in Python
mwang-lifesize/Facial-Similarity-with-Siamese-Networks-in-Pytorch
Implementing Siamese networks with a contrastive loss for similarity learning
mwang-lifesize/frugally-deep
Header-only library for using Keras models in C++.
mwang-lifesize/Generic-Speaker-Verificator
mwang-lifesize/ivector-xvector
Extract x-vectors and i-vectors using Kaldi
mwang-lifesize/keras-sincnet
Keras (tensorflow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet)
mwang-lifesize/py-kaldi-asr
Some simple wrappers around kaldi-asr intended to make using Kaldi's (online) decoders as convenient as possible.
mwang-lifesize/PyTorch_Speaker_Verification
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
mwang-lifesize/raw-audio-gender-classification
Machine learning experiment to perform gender classification from raw audio.
mwang-lifesize/rnnoise-wasm
rnnoise noise suppression library as a WASM module
mwang-lifesize/simple_bodypix_python
A simple and minimal bodypix inference in python
mwang-lifesize/SincNet
SincNet is a neural architecture for efficiently processing raw audio samples.
mwang-lifesize/Speaker-Identification
A program for automatic speaker identification using deep learning techniques.
mwang-lifesize/Speaker-Identification-Python
Speaker Identification System (up to 100% accuracy); built using Python 2.7 and the python_speech_features library
mwang-lifesize/speaker-recognition-3d-cnn
Keras + PyTorch implementation of "Deep Learning & 3D Convolutional Neural Networks for Speaker Verification"
mwang-lifesize/speaker-recognition-py3
Speech recognition based on MFCC and Gaussian mixture models (GMM)
mwang-lifesize/speakerIdentificationNeuralNetworks
The Speaker Recognition System consists of two phases: Feature Extraction and Recognition. In the Extraction phase, the speaker's voice is recorded and a typical number of features are extracted to form a model. During the Recognition phase, a speech sample is compared against a previously created voice print stored in the database. The system can also identify the speaker's voice in a multi-speaker environment. A multi-layer perceptron (MLP) neural network based on the error back-propagation training algorithm was used to train and test the system. The system response time was 74 µs with an average efficiency of 95%.
mwang-lifesize/SpeakerRecognition_tutorial
Simple d-vector based Speaker Recognition using Pytorch
mwang-lifesize/timit
The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus
mwang-lifesize/VGGVox
VGGVox models for Speaker Identification and Verification trained on the VoxCeleb (1 & 2) datasets
mwang-lifesize/voicemap
Identifying people from small audio fragments
mwang-lifesize/wer_are_we
Attempt at tracking the state of the art and recent results (bibliography) on speech recognition.
mwang-lifesize/zamia-speech
Open tools and data for cloudless automatic speech recognition