
Digital Signal Processing: Discrete HMM

  • Discrete Hidden Markov Model Implementation in C++
  • Implemented based on the DSP course offered by NTU: HMM.pdf

Environment

  • < g++ [gcc version 8.2.0 (GCC)] > (Tested)
  • < g++ [gcc version 7.3.0 (GCC)] > (Tested)
  • < g++ [gcc version 4.2.1 (GCC)] > (Tested)
  • < Python 3.6+ > (Optional - for plot.py)
  • < Matplotlib 2.2.3 > (Optional - for plot.py)

Algorithm

  • Initialization:
    • Set θ = ( A , B , π ) with initial conditions (model_init.txt)
  • Forward Procedure:
    • Recursively compute αi(t) = P( O1=o1, ..., Ot=ot, qt=i | θ ), the probability of seeing the partial observation sequence o1, o2, ..., ot and being in state i at time t.
  • Backward Procedure:
    • Recursively compute βi(t) = P( Ot+1=ot+1, ..., OT=oT | qt=i, θ ), the probability of the ending partial observation sequence ot+1, ..., oT given that the model is in state i at time t.
  • Accumulation Procedure:
    • Calculate the temporary variables gamma and epsilon according to Bayes' theorem.
    • Gamma: the probability of being in state i at time t given the observed sequence O and the parameters θ.
    • Epsilon: the probability of being in state i at time t and in state j at time t+1, given the observed sequence O and the parameters θ.
  • Update Procedure:
    • The parameters of the hidden Markov model θ can now be updated: ( A , B , π ) = ( Â , B̂ , π̂ ). A C++ sketch of one full iteration is shown below.
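The sketch below ties the four procedures above into a single Baum-Welch iteration over one observation sequence. It is a minimal illustration, not the repo's hmm.h/hmm.cpp API: the HMM struct and baum_welch_step are hypothetical names, numerical scaling (needed to avoid underflow on long sequences) is omitted, and the actual trainer accumulates the gamma/epsilon statistics over all training sequences before updating.

```cpp
#include <vector>

// Hypothetical container; the repo's hmm.h defines its own model type.
struct HMM {
    int N, M;                                // number of states / observation symbols
    std::vector<double> pi;                  // pi[i]: initial probability of state i
    std::vector<std::vector<double>> A;      // A[i][j]: transition probability i -> j
    std::vector<std::vector<double>> B;      // B[i][k]: probability of emitting symbol k in state i
};

// One Baum-Welch iteration on a single observation sequence o[0..T-1].
// No scaling or zero-denominator guards, for brevity.
void baum_welch_step(HMM &h, const std::vector<int> &o) {
    const int T = (int)o.size(), N = h.N;

    // Forward procedure: alpha[t][i] = P(o_1..o_t, q_t = i | theta)
    std::vector<std::vector<double>> alpha(T, std::vector<double>(N, 0.0));
    for (int i = 0; i < N; ++i) alpha[0][i] = h.pi[i] * h.B[i][o[0]];
    for (int t = 1; t < T; ++t)
        for (int j = 0; j < N; ++j) {
            double s = 0.0;
            for (int i = 0; i < N; ++i) s += alpha[t - 1][i] * h.A[i][j];
            alpha[t][j] = s * h.B[j][o[t]];
        }

    // Backward procedure: beta[t][i] = P(o_{t+1}..o_T | q_t = i, theta)
    std::vector<std::vector<double>> beta(T, std::vector<double>(N, 1.0));
    for (int t = T - 2; t >= 0; --t)
        for (int i = 0; i < N; ++i) {
            double s = 0.0;
            for (int j = 0; j < N; ++j) s += h.A[i][j] * h.B[j][o[t + 1]] * beta[t + 1][j];
            beta[t][i] = s;
        }

    // P(O | theta): the normalizer used by gamma and epsilon
    double pO = 0.0;
    for (int i = 0; i < N; ++i) pO += alpha[T - 1][i];

    // Accumulation: gamma[t][i] and the epsilon sums over t
    std::vector<std::vector<double>> gamma(T, std::vector<double>(N, 0.0));
    std::vector<std::vector<double>> epsSum(N, std::vector<double>(N, 0.0));
    for (int t = 0; t < T; ++t)
        for (int i = 0; i < N; ++i)
            gamma[t][i] = alpha[t][i] * beta[t][i] / pO;
    for (int t = 0; t + 1 < T; ++t)
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                epsSum[i][j] += alpha[t][i] * h.A[i][j] * h.B[j][o[t + 1]] * beta[t + 1][j] / pO;

    // Update: pi_i = gamma_1(i); a_ij and b_i(k) are ratios of expected counts
    for (int i = 0; i < N; ++i) {
        h.pi[i] = gamma[0][i];
        double gTrans = 0.0;                             // sum of gamma over t = 1..T-1
        for (int t = 0; t + 1 < T; ++t) gTrans += gamma[t][i];
        for (int j = 0; j < N; ++j) h.A[i][j] = epsSum[i][j] / gTrans;
        const double gAll = gTrans + gamma[T - 1][i];    // sum of gamma over all t
        for (int k = 0; k < h.M; ++k) {
            double num = 0.0;
            for (int t = 0; t < T; ++t) if (o[t] == k) num += gamma[t][i];
            h.B[i][k] = num / gAll;
        }
    }
}
```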
  • Viterbi Algorithm: a dynamic programming algorithm for finding the most likely sequence of hidden states that results in a given sequence of observed events.
  • Given a hidden Markov model (HMM) with state space Q, initial probabilities πi of being in state i, and transition probabilities a(i,j) of transitioning from state i to state j, suppose we observe outputs o1, ..., oT. The most likely state sequence q1, ..., qT that produces the observations is given by the Viterbi recurrence relations.
  • The algorithm generates a path X = ( x1, x2, ..., xT ), a sequence of states xt ∈ S = { s1, s2, ..., sK } that generates the observations O = ( o1, o2, ..., oT ), where each ot is drawn from the observation alphabet V = { v1, v2, ..., vN }, N being the number of distinct observation symbols. (See the C++ sketch below.)
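Below is a minimal Viterbi sketch under the same assumptions as above (hypothetical names, not the repo's test_hmm.cpp code; plain probabilities rather than the numerically safer log-probabilities). δt(j) holds the probability of the best path ending in state j at time t, and ψ records the argmax used for backtracking.

```cpp
#include <vector>

// Most likely state path for observation sequence o, given pi, A, B as above.
std::vector<int> viterbi(const std::vector<double> &pi,
                         const std::vector<std::vector<double>> &A,
                         const std::vector<std::vector<double>> &B,
                         const std::vector<int> &o) {
    const int N = (int)pi.size(), T = (int)o.size();
    std::vector<std::vector<double>> delta(T, std::vector<double>(N, 0.0));
    std::vector<std::vector<int>> psi(T, std::vector<int>(N, 0));

    // Initialization: delta_1(i) = pi_i * b_i(o_1)
    for (int i = 0; i < N; ++i) delta[0][i] = pi[i] * B[i][o[0]];

    // Recursion: delta_t(j) = max_i delta_{t-1}(i) * a(i,j) * b_j(o_t)
    for (int t = 1; t < T; ++t)
        for (int j = 0; j < N; ++j) {
            int best = 0;
            double bestP = delta[t - 1][0] * A[0][j];
            for (int i = 1; i < N; ++i) {
                const double p = delta[t - 1][i] * A[i][j];
                if (p > bestP) { bestP = p; best = i; }
            }
            delta[t][j] = bestP * B[j][o[t]];
            psi[t][j] = best;               // argmax, for backtracking
        }

    // Termination and backtracking from the most probable final state
    std::vector<int> q(T);
    int last = 0;
    for (int i = 1; i < N; ++i)
        if (delta[T - 1][i] > delta[T - 1][last]) last = i;
    q[T - 1] = last;
    for (int t = T - 1; t > 0; --t) q[t - 1] = psi[t][q[t]];
    return q;
}
```

In the testing step, each sequence in testing_data1.txt would presumably be scored against all five trained models and labeled with the model that yields the highest Viterbi probability.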

File Description

.
├── src/
|   ├── Makefile                g++ compiler make file
|   ├── hmm.h                   HMM implementation
|   ├── hmm.cpp                 HMM implementation
|   ├── test_hmm.cpp            Testing algorithm implementation
|   ├── train_hmm.cpp           Training algorithm implementation
|   ├── test                    Unix executable built from test_hmm.cpp  (see the "Usage" section below for details)
|   ├── train                   Unix executable built from train_hmm.cpp (see the "Usage" section below for details)
|   ├── plot.py                 Draws the training plot
|   ├── model_01~05.txt         Trained models
|   └── modellist.txt           Model name list
├── data/
|   ├── model_init.txt          Initial model for training
|   ├── seq_model_01~05.txt     Training data (observation sequences)
|   ├── testing_data1.txt       Testing data (observation sequences)
|   ├── testing_answer.txt      Ground-truth answers for "testing_data1.txt"
|   ├── testing_result.txt      Model-generated answers for "testing_data1.txt"
|   └── testing_data2.txt       Testing data without answer
├── problem_description.pdf     Work Spec
└── Readme.md                   This File

Usage

Train models separately, then test

└── src/
    ├── make clean
    ├── make
    ├── ./train #iteration ../data/model_init.txt ../data/seq_model_01.txt model_01.txt
    ├── ./train #iteration ../data/model_init.txt ../data/seq_model_02.txt model_02.txt
    ├── ./train #iteration ../data/model_init.txt ../data/seq_model_03.txt model_03.txt
    ├── ./train #iteration ../data/model_init.txt ../data/seq_model_04.txt model_04.txt
    ├── ./train #iteration ../data/model_init.txt ../data/seq_model_05.txt model_05.txt
    └── ./test modellist.txt ../data/testing_data1.txt ../data/testing_result1.txt
- #iteration is a positive integer specifying the number of Baum-Welch training iterations.

Train all models at once, then test

└── src/
    ├── make clean
    ├── make
    ├── ./train 250 ../data/model_init.txt ../data/ modellist.txt all
    └── ./test modellist.txt ../data/testing_data1.txt ../data/testing_result1.txt
- The first argument (250 here) is the number of Baum-Welch training iterations; any positive integer works.

Train all models at once, computing each HMM's test score at every iteration (suggested usage)

└── src/
    ├── make clean
    ├── make
    ├── ./train 250 ../data/model_init.txt ../data/ modellist.txt all test
    └── ./test modellist.txt ../data/testing_data1.txt ../data/testing_result1.txt
- The first argument (250 here) is the number of Baum-Welch training iterations; any positive integer works.

Draw Training Plot

└── src/
    └── python3 plot.py

Experiment: Iteration vs. Accuracy Plot

  • Result: Maximum accuracy of 0.8708, achieved at the 2560th iteration
  • Training Plot: