mfcc

There are 240 repositories under mfcc topic.

ddbourgin/numpy-ml
Machine learning, in numpy
Language:Python14.7k 452 503.7k
aubio/aubio
a library for audio and music analysis
Language:C3.2k 83 338373
libAudioFlux/audioFlux
A library for audio and music analysis, feature extraction.
Language:C2.1k 29 14101
x4nth055/emotion-recognition-using-speech
Building and training Speech Emotion Recognizer that predicts human emotions using Python, Sci-kit learn and Keras
Language:Python537 22 34224
ar1st0crat/NWaves
.NET DSP library with a lot of audio processing functions
Language:C#439 31 6668
SuperKogito/spafe
:sound: spafe: Simplified Python Audio Features Extraction
Language:Python435 9 4676
adamstark/Gist
A C++ Library for Audio Analysis
Language:C++363 26 2374
gionanide/Speech_Signal_Processing_and_Classification
Front-end speech processing aims at extracting proper features from short- term segments of a speech utterance, known as frames. It is a pre-requisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interesting in voice disorder classification. That is, to develop two-class classifiers, which can discriminate between utterances of a subject suffering from say vocal fold paralysis and utterances of a healthy subject.The mathematical modeling of the speech production system in humans suggests that an all-pole system function is justified [1-3]. As a consequence, linear prediction coefficients (LPCs) constitute a first choice for modeling the magnitute of the short-term spectrum of speech. LPC-derived cepstral coefficients are guaranteed to discriminate between the system (e.g., vocal tract) contribution and that of the excitation. Taking into account the characteristics of the human ear, the mel-frequency cepstral coefficients (MFCCs) emerged as descriptive features of the speech spectral envelope. Similarly to MFCCs, the perceptual linear prediction coefficients (PLPs) could also be derived. The aforementioned sort of speaking tradi- tional features will be tested against agnostic-features extracted by convolu- tive neural networks (CNNs) (e.g., auto-encoders) [4]. The pattern recognition step will be based on Gaussian Mixture Model based classifiers,K-nearest neighbor classifiers, Bayes classifiers, as well as Deep Neural Networks. The Massachussets Eye and Ear Infirmary Dataset (MEEI-Dataset) [5] will be exploited. At the application level, a library for feature extraction and classification in Python will be developed. Credible publicly available resources will be 1used toward achieving our goal, such as KALDI. Comparisons will be made against [6-8].
Language:Python232 10 463
jsingh811/pyAudioProcessing
Audio feature extraction and classification
Language:Python214 3 1937
sp-nitech/SPTK
A suite of speech signal processing tools
Language:C++214 17 624
SuperKogito/Voice-based-gender-recognition
:sound: :boy: :girl:Voice based gender recognition using Mel-frequency cepstrum coefficients (MFCC) and Gaussian mixture models (GMM)
Language:Python192 10 1465
csukuangfj/kaldifeat
Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd - Provide C++ & Python API
Language:C++173 7 3534
ewan-xu/LibrosaCpp
LibrosaCpp is a c++ implemention of librosa to compute short-time fourier transform coefficients,mel spectrogram or mfcc
Language:C++168 4 1040
SuyashMore/MevonAI-Speech-Emotion-Recognition
Identify the emotion of multiple speakers in an Audio Segment
Language:C159 7 1145
sp-nitech/diffsptk
A differentiable version of SPTK
Language:Python153 9 513
tympanix/subsync
Synchronize your subtitles using machine learning
Language:Python138 9 1816
amanbasu/speech-emotion-recognition
Detecting emotions using MFCC features of human speech using Deep Learning
Language:Jupyter Notebook123 4 938
GauravWaghmare/Speaker-Identification
A program for automatic speaker identification using deep learning techniques.
Language:Python84 11 1126
MycroftAI/sonopy
A simple audio feature extraction library
Language:Python78 9 221
ZitengWang/python_kaldi_features
python codes to extract MFCC and FBANK speech features for Kaldi
Language:Python62 7 417
georgid/AlignmentDuration
Lyrics-to-audio-alignement system. Based on Machine Learning Algorithms: Hidden Markov Models with Viterbi forced alignment. The alignment is explicitly aware of durations of musical notes. The phonetic model are classified with MLP Deep Neural Network.
Language:Python55 5 626
mathquis/node-personal-wakeword
Personal wake word detector
Language:JavaScript54 2 58
SuperKogito/Voice-based-speaker-identification
:sound: :boy: :girl: :woman: :man: Speaker identification using voice MFCCs and GMM
Language:Python50 4 114
zafarrafii/Zaf-Python
Zafar's Audio Functions in Python for audio signal analysis: STFT, inverse STFT, mel filterbank, mel spectrogram, MFCC, CQT kernel, CQT spectrogram, CQT chromagram, DCT, DST, MDCT, inverse MDCT.
Language:Jupyter Notebook49 1 111
zafarrafii/Zaf-Matlab
Zafar's Audio Functions in Matlab for audio signal analysis: STFT, inverse STFT, mel filterbank, mel spectrogram, MFCC, CQT kernel, CQT spectrogram, CQT chromagram, DCT, DST, MDCT, inverse MDCT.
Language:Jupyter Notebook48 3 014
aubio/vamp-aubio-plugins
aubio plugins for Vamp
Language:C++47 10 612
supikiti/PNCC
A implementation of Power Normalized Cepstral Coefficients: PNCC
Language:Python47 3 510
k-farruh/speech-accent-detection
The human speaks a language with an accent. A particular accent necessarily reflects a person's linguistic background. The model defines accent based audio record. The result of the model could be used to determine accents and help decrease accents to English learning students and improve accents by training.
Language:Python45 1 29
sheelabhadra/Emergency-Vehicle-Detection
Python implementation of papers on emergency vehicle detection using audio signals
Language:Jupyter Notebook43 6 313
alicex2020/Deep-Learning-Lie-Detection
Use machine learning models to detect lies based solely on acoustic speech information
Language:Jupyter Notebook42 2 410
pulakk/Live-Audio-MFCC
Live Audio MFCC Visualization in the browser using Web Audio API - https://pulakk.github.io/Live-Audio-MFCC/tutorial
Language:JavaScript41 3 06
mechanicalsea/spectra
Spectra extraction tutorials based on torch and torchaudio.
Language:Jupyter Notebook38 5 14
zhengyima/DTW_Digital_Voice_Recognition
基于DTW与MFCC特征进行数字0-9的语音识别，DTW，MFCC，语音识别，中英数据，端点检测，Digital Voice Recognition。
Language:Python37 4 16
dydtjr1128/Speaker-Recognition-using-NN
Speaker Recognition using Neural Network & Linear Regression
Language:Jupyter Notebook33 2 112
FragJage/SpeakerVoiceIdentifier
SpeakerVoiceIdentifier can recognize the voice of a speaker by learning.
Language:C++32 8 515
skaws2003/pytorch-mfcc
A pytorch implementation of MFCC.
Language:Python32 1 31

mfcc

ddbourgin/numpy-ml

aubio/aubio

libAudioFlux/audioFlux

x4nth055/emotion-recognition-using-speech

ar1st0crat/NWaves

SuperKogito/spafe

adamstark/Gist

gionanide/Speech_Signal_Processing_and_Classification

jsingh811/pyAudioProcessing

sp-nitech/SPTK

SuperKogito/Voice-based-gender-recognition

csukuangfj/kaldifeat

ewan-xu/LibrosaCpp

SuyashMore/MevonAI-Speech-Emotion-Recognition

sp-nitech/diffsptk

tympanix/subsync

amanbasu/speech-emotion-recognition

GauravWaghmare/Speaker-Identification

MycroftAI/sonopy

ZitengWang/python_kaldi_features

georgid/AlignmentDuration

mathquis/node-personal-wakeword

SuperKogito/Voice-based-speaker-identification

zafarrafii/Zaf-Python

zafarrafii/Zaf-Matlab

aubio/vamp-aubio-plugins

supikiti/PNCC

k-farruh/speech-accent-detection

sheelabhadra/Emergency-Vehicle-Detection

alicex2020/Deep-Learning-Lie-Detection

pulakk/Live-Audio-MFCC

mechanicalsea/spectra

zhengyima/DTW_Digital_Voice_Recognition

dydtjr1128/Speaker-Recognition-using-NN

FragJage/SpeakerVoiceIdentifier

skaws2003/pytorch-mfcc