Python implementation of simple GMM and HMM models for isolated digit recognition.
This implementation contains 3 models:
- Single Gaussian: Each digit is modeled using a single Gaussian with diagonal covariance.
- Gaussian Mixture Model (GMM): Each digit is modeled using a mixture of Gaussians, initialized by perturbing the single Gaussian model.
- Hidden Markov Model (HMM): Each digit is modeled by an HMM consisting of N states, where the emission probability of each state is a single Gaussian with diagonal covariance.
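To make the "single Gaussian with diagonal covariance" idea concrete, here is a minimal pure-Python sketch of fitting such a Gaussian to feature frames and scoring one frame. The function names (fit_diag_gauss, log_likelihood) and the variance floor are illustrative assumptions and are not part of this package's API.

```python
import math

def fit_diag_gauss(frames, var_floor=1e-6):
    """Estimate per-dimension mean and variance from a list of feature frames."""
    n = len(frames)
    ndim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(ndim)]
    # Floor the variance so a constant dimension cannot cause division by zero.
    var = [
        max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, var_floor)
        for d in range(ndim)
    ]
    return mean, var

def log_likelihood(frame, mean, var):
    """Log density of one frame under a diagonal-covariance Gaussian."""
    # With a diagonal covariance the density factorizes over dimensions,
    # so the log density is a sum of independent 1-D Gaussian terms.
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )
```

Under this kind of model, an utterance would typically be scored by summing the per-frame log-likelihoods, and the predicted digit is the one whose Gaussian gives the highest total.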
Disclaimer: This is an educational implementation and is not expected to be high-performance.
To install for usage:
pip install git+https://github.com/desh2608/gmm-hmm-asr.git
To install with tests (for development):
git clone https://github.com/desh2608/gmm-hmm-asr.git
cd gmm-hmm-asr && pip install -e .
pytest
Running pytest exercises each of the 3 models end-to-end and takes approximately 2-3 minutes.
To train, first create train_data, which should be a list of DataTuple(key, feats, label) objects.
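As a rough illustration of the expected data layout, the sketch below builds a small train_data list from random features. It makes several assumptions: the namedtuple is a stand-in for gmm_hmm_asr.data.DataTuple so the snippet runs on its own, make_dummy_data is a hypothetical helper, and feats is represented as nested lists of shape (nframes, ndim), whereas the real package most likely expects a NumPy array.

```python
import random
from collections import namedtuple

# Stand-in for gmm_hmm_asr.data.DataTuple so this sketch is self-contained;
# in real use, import DataTuple from the package instead.
DataTuple = namedtuple("DataTuple", ["key", "feats", "label"])

def make_dummy_data(digits, utts_per_digit=2, nframes=30, ndim=40, seed=0):
    """Build a list of DataTuple objects with random features (illustration only)."""
    rng = random.Random(seed)
    data = []
    for digit in digits:
        for i in range(utts_per_digit):
            # One utterance: nframes feature vectors, each of length ndim.
            feats = [[rng.gauss(0.0, 1.0) for _ in range(ndim)] for _ in range(nframes)]
            data.append(DataTuple(key=f"{digit}_utt{i}", feats=feats, label=digit))
    return data

train_data = make_dummy_data(["1", "2", "3", "4", "5"])
```

In practice the features would come from a front end such as MFCC or filterbank extraction rather than a random generator.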
1. Single Gaussian
from gmm_hmm_asr.data import DataTuple
from gmm_hmm_asr.trainers import SingleGaussTrainer
ndim = 40 # dimensionality of features
DIGITS = ['1','2','3','4','5'] # digits to be recognized
sg_model = SingleGaussTrainer(ndim, DIGITS)
sg_model.train(train_data)
For prediction, again create a test_data list similar to train_data.
preds = sg_model.predict(test_data)
y_pred = [pred[0] for pred in preds] # predicted labels
y_ll = [pred[1] for pred in preds] # maximum log-likelihood
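Once predicted labels are available, they can be compared against the reference labels. The snippet below is a minimal sketch: the accuracy helper and the example predictions are hypothetical, and only the (label, log-likelihood) pair layout is taken from the usage shown above.

```python
def accuracy(y_true, y_pred):
    """Fraction of utterances whose predicted label matches the reference."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical predictions in the (label, log-likelihood) form used above.
preds = [("1", -120.5), ("2", -98.1), ("3", -110.0), ("3", -101.7)]
y_true = ["1", "2", "3", "4"]
y_pred = [pred[0] for pred in preds]
print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
```

With real data, y_true would be collected from the label field of each DataTuple in test_data.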
2. Gaussian Mixture Model
from gmm_hmm_asr.data import DataTuple
from gmm_hmm_asr.trainers import GMMTrainer
ndim = 40 # dimensionality of features
ncomp = 8 # number of Gaussian components
niter = 10 # number of training iterations
DIGITS = ['1','2','3','4','5'] # digits to be recognized
gmm_model = GMMTrainer(ndim, ncomp, DIGITS)
gmm_model.train(train_data, niter)
preds = gmm_model.predict(test_data)
3. Hidden Markov Model
from gmm_hmm_asr.data import DataTuple
from gmm_hmm_asr.trainers import HMMTrainer
ndim = 40 # dimensionality of features
nstate = 5 # number of HMM states
niter = 10 # number of training iterations
DIGITS = ['1','2','3','4','5'] # digits to be recognized
hmm_model = HMMTrainer(ndim, nstate, DIGITS)
hmm_model.train(train_data, niter)
preds = hmm_model.predict(test_data)
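For intuition about how an HMM can assign a log-likelihood to an utterance, here is a minimal sketch of the forward algorithm in the log domain. It is a generic textbook formulation, not necessarily what this package does internally (an implementation may, for example, use Viterbi scoring instead). The names logsumexp and forward_loglik and the argument layout are assumptions made for illustration.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_loglik(log_init, log_trans, log_emit):
    """Log-likelihood of an observation sequence under an HMM.

    log_init[j]     : log P(state_0 = j)
    log_trans[i][j] : log P(state_t = j | state_{t-1} = i)
    log_emit[t][j]  : log p(frame_t | state j), e.g. from a diagonal Gaussian
    """
    nstate = len(log_init)
    # alpha[j] holds log P(frames_0..t, state_t = j).
    alpha = [log_init[j] + log_emit[0][j] for j in range(nstate)]
    for t in range(1, len(log_emit)):
        alpha = [
            logsumexp([alpha[i] + log_trans[i][j] for i in range(nstate)])
            + log_emit[t][j]
            for j in range(nstate)
        ]
    # Marginalize over the final state.
    return logsumexp(alpha)
```

Classification then follows the same pattern as the other models: score the utterance under each digit's HMM and pick the digit with the highest log-likelihood.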
If you find any bugs, please raise an Issue or contact draj@cs.jhu.edu.