/clj-viterbi

Viterbi search in clojure.

Primary LanguageClojureMIT LicenseMIT

clj-viterbi

Build Status

Viterbi decoding in clojure.

Version

Viterbi decoding finds the most likely path through a hidden Markov model (HMM) that generates a series of observations.

Example

Here's an HMM that describes a model where a person can be either healthy or feverish, and can exhibit symptoms based on their health (taken from https://en.wikipedia.org/wiki/Viterbi_algorithm):

Screenshot of Mavelous running in a desktop browser

According to this model, there is a 60% chance of starting out healthy and a 40% chance of starting out sick. Then on any given day, if someone is healthy then they have a 70% of staying healthy the next day and a 30% chance of getting a fever. If someone has a fever, they have a 60% of continuing to have a fever the next day, or a 40% chance of recovering.

If someone is healthy, there is a 10% chance of them being dizzy, a 40% chance of them being cold, and a 50% chance of being normal. If someone has a fever, there is a 60% chance of being dizzy, a 30% chance of being cold, and a 10% chance of being normal.

To define an HMM, you call make-hmm with the following keyword arguments:

Argument Description
:states The possible states.
:start-p A map from state → starting probability of the state.
:trans-p A function F(s1, s2) → probability of transitioning from state s1 to state s2.
:emit-p A function F(s, o) → probability of emitting observation o from state s.

For example,

(ns viterbi-example
    (:require [com.lemonodor.viterbi :as viterbi]))

(def hmm
  (viterbi/make-hmm
   :states [:healthy :fever]
   :start-p {:healthy 0.6, :fever 0.4}
   :trans-p (let [t {:healthy {:healthy 0.7, :fever 0.3},
                     :fever {:healthy 0.4, :fever 0.6}}]
              #((t %1) %2))
   :emit-p (let [t {:healthy {:normal 0.5, :cold 0.4, :dizzy 0.1},
                    :fever {:normal 0.1, :cold 0.3, :dizzy 0.6}}]
             #((t %1) %2)))

Now we can find the sequence of states that is most likely given the sequence of observations normal, cold, dizzy:

(viterbi/viterbi hmm [:normal :cold :dizzy])
;; -> [-1.8204482088348124 [:healthy :healthy :fever]]

The return value is a vector containing the log10-probability of the most likely sequence of states, and the most likely sequence of states. In this case, the most likely sequence of states is healthy, healthy, fever.