n-gram Hidden Markov Model to identify gene names in biological text
This is my solution to Programming Assignment 1 of the Coursera Natural Language Processing course by Michael Collins.
Part I: Unigram tagger
Part II: Trigram tagger with my implementation of the Viterbi algorithm
Part III: Extended HMM tagger that groups infrequent words into four informative word classes (Numeric, All Capitals, Last Capital, and Rare)
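
As a rough illustration of the word classes in Part III, here is a minimal sketch of how words might be mapped to the four classes. It is not the code from this repository. The function names (`word_class`, `replace_rare`), the class tokens (`_NUMERIC_` and so on), the exact class definitions, and the frequency threshold of 5 are all assumptions made for the example.

```python
from collections import Counter

# Assumption: words seen fewer than this many times in training count as infrequent.
RARE_THRESHOLD = 5


def word_class(word):
    """Map a word to one of four class tokens (token names are hypothetical)."""
    # Numeric: the word contains at least one digit.
    if any(ch.isdigit() for ch in word):
        return "_NUMERIC_"
    # All Capitals: the word consists entirely of uppercase letters.
    if word.isalpha() and word.isupper():
        return "_ALL_CAPITALS_"
    # Last Capital: not all capitals, but the final character is uppercase.
    if word[-1:].isupper():
        return "_LAST_CAPITAL_"
    # Rare: everything else.
    return "_RARE_"


def replace_rare(sentences, threshold=RARE_THRESHOLD):
    """Replace words below the count threshold with their class token."""
    counts = Counter(word for sentence in sentences for word in sentence)
    return [
        [word if counts[word] >= threshold else word_class(word) for word in sentence]
        for sentence in sentences
    ]


if __name__ == "__main__":
    for example in ["p53", "DNA", "mRNA", "kinase"]:
        print(example, word_class(example))
```

With a mapping like this, the emission counts for the HMM would be collected over the class tokens instead of the original infrequent words, and unseen words at tagging time would be mapped through the same function before lookup.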