Java programs for NLP
p1: Given a training corpus, estimate bigram language models with smoothing (Good-Turing
and Laplace), then use the models to find misspellings in a test corpus.
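
For p1, the Laplace (add-one) estimate can be sketched as below. This is a minimal illustration and not the code in this repository: the class LaplaceBigram, its method names, and the threshold-based misspelling check are assumptions made for the example. Good-Turing smoothing is not shown.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an add-one (Laplace) smoothed bigram model.
final class LaplaceBigram {
    // c(prev): how often a token appears as the first word of a bigram
    private final Map<String, Integer> historyCounts = new HashMap<>();
    // c(prev, word), keyed as "prev word"
    private final Map<String, Integer> bigramCounts = new HashMap<>();
    // vocabulary observed in training; its size is V
    private final Set<String> vocab = new HashSet<>();

    /** Adds the counts from one tokenized sentence. */
    void train(String[] tokens) {
        for (int i = 0; i < tokens.length; i++) {
            vocab.add(tokens[i]);
            if (i + 1 < tokens.length) {
                historyCounts.merge(tokens[i], 1, Integer::sum);
                bigramCounts.merge(tokens[i] + " " + tokens[i + 1], 1, Integer::sum);
            }
        }
    }

    /** P(word | prev) = (c(prev, word) + 1) / (c(prev) + V). Call after training. */
    double prob(String prev, String word) {
        int pairCount = bigramCounts.getOrDefault(prev + " " + word, 0);
        int prevCount = historyCounts.getOrDefault(prev, 0);
        return (pairCount + 1.0) / (prevCount + vocab.size());
    }

    /** Assumed heuristic: flag a word whose smoothed probability falls below a threshold. */
    boolean isSuspicious(String prev, String word, double threshold) {
        return prob(prev, word) < threshold;
    }
}
```

After training on the two sentences "the cat sat" and "the dog sat", the vocabulary has 4 words and "the" occurs twice as a history, so prob("the", "cat") is (1 + 1) / (2 + 4) = 2/6 and the unseen bigram prob("the", "sat") is 1/6.
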
p2: Given a POS-tagged (labeled, training) corpus and an untagged (unlabeled, test) corpus, train
an HMM to predict POS tags on the test corpus. Implements the Viterbi, forward, backward, and EM algorithms.
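
For p2, Viterbi decoding can be sketched as below. This is an illustrative version under stated assumptions, not the code in this repository: tags and words are assumed to be already mapped to integer indices, the HMM parameters are passed in as plain probability arrays, and the class name ViterbiSketch is hypothetical. Scores are kept in log space to avoid underflow on long sentences.

```java
// Hypothetical sketch of Viterbi decoding for an HMM tagger.
final class ViterbiSketch {
    /**
     * init[s]     = P(first tag = s)
     * trans[r][s] = P(next tag = s | current tag = r)
     * emit[s][o]  = P(word o | tag s)
     * obs         = word indices of the sentence
     * Returns the most probable tag index for each position.
     */
    static int[] decode(double[] init, double[][] trans, double[][] emit, int[] obs) {
        int n = obs.length;
        int k = init.length;
        if (n == 0) {
            return new int[0];
        }
        double[][] score = new double[n][k]; // best log-probability of a path ending in tag s at time t
        int[][] back = new int[n][k];        // predecessor tag on that best path

        for (int s = 0; s < k; s++) {
            score[0][s] = Math.log(init[s]) + Math.log(emit[s][obs[0]]);
        }
        for (int t = 1; t < n; t++) {
            for (int s = 0; s < k; s++) {
                double best = Double.NEGATIVE_INFINITY;
                int arg = 0;
                for (int r = 0; r < k; r++) {
                    double cand = score[t - 1][r] + Math.log(trans[r][s]);
                    if (cand > best) {
                        best = cand;
                        arg = r;
                    }
                }
                score[t][s] = best + Math.log(emit[s][obs[t]]);
                back[t][s] = arg;
            }
        }

        // Pick the best final tag, then follow the back-pointers.
        int last = 0;
        for (int s = 1; s < k; s++) {
            if (score[n - 1][s] > score[n - 1][last]) {
                last = s;
            }
        }
        int[] path = new int[n];
        path[n - 1] = last;
        for (int t = n - 1; t > 0; t--) {
            path[t - 1] = back[t][path[t]];
        }
        return path;
    }
}
```
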
p3: Implement the CYK algorithm.
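
For p3, a CYK recognizer can be sketched as below. This is a minimal illustration and not the code in this repository: it assumes the grammar is already in Chomsky normal form, it only answers whether a sentence is derivable (no parse tree, no probabilities), and the class CykSketch and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a CYK recognizer for a grammar in Chomsky normal form.
final class CykSketch {
    private final List<String[]> binary = new ArrayList<>();  // {A, B, C} for rule A -> B C
    private final List<String[]> lexical = new ArrayList<>(); // {A, w} for rule A -> w

    void addBinary(String a, String b, String c) {
        binary.add(new String[]{a, b, c});
    }

    void addLexical(String a, String word) {
        lexical.add(new String[]{a, word});
    }

    /** Returns true if the start symbol derives the whole word sequence. */
    boolean recognizes(String[] words, String start) {
        int n = words.length;
        if (n == 0) {
            return false;
        }
        // chart[i][len] holds the nonterminals that derive words[i .. i+len-1]
        @SuppressWarnings("unchecked")
        Set<String>[][] chart = new HashSet[n][n + 1];
        for (int i = 0; i < n; i++) {
            for (int len = 1; len <= n - i; len++) {
                chart[i][len] = new HashSet<>();
            }
        }
        // Spans of length 1 come from lexical rules.
        for (int i = 0; i < n; i++) {
            for (String[] rule : lexical) {
                if (rule[1].equals(words[i])) {
                    chart[i][1].add(rule[0]);
                }
            }
        }
        // Longer spans combine two adjacent shorter spans via a binary rule.
        for (int len = 2; len <= n; len++) {
            for (int i = 0; i + len <= n; i++) {
                for (int split = 1; split < len; split++) {
                    for (String[] rule : binary) {
                        if (chart[i][split].contains(rule[1])
                                && chart[i + split][len - split].contains(rule[2])) {
                            chart[i][len].add(rule[0]);
                        }
                    }
                }
            }
        }
        return chart[0][n].contains(start);
    }
}
```

With the toy rules S -> NP VP, VP -> V NP, NP -> she, NP -> fish, and V -> eats, the sentence "she eats fish" is recognized from S, while "eats she fish" is not.
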