/NLP

An implementation of the algorithms from the book Speech and Language Processing by Jurafsky and Martin.

nlp

A Clojure implementation of the algorithms found in Speech and Language Processing by Daniel Jurafsky and James H. Martin.

The purpose of this code base is twofold: to improve my comfort and skill with Clojure, and to deepen my knowledge of the algorithms used in natural language processing and computational linguistics.

Implementations

Chapter 2

  • Exercise 2.2 (ELIZA) implemented in src/nlp/eliza.clj

Chapter 4

  • Exercise 4.2 implemented in src/nlp/ngrams.clj
  • Exercise 4.4 implemented in src/nlp/ngrams.clj
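
For readers unfamiliar with n-gram models, here is a minimal sketch of unsmoothed bigram estimation. It is not the code in src/nlp/ngrams.clj: the names tokenize, ngram-counts and bigram-probability are hypothetical and chosen only for illustration, and the whitespace tokenizer is a simplifying assumption.

```clojure
(require '[clojure.string :as str])

;; Illustrative helpers only; these names are assumptions and do not
;; come from src/nlp/ngrams.clj.

(defn tokenize
  "Lowercases a string and splits it on runs of whitespace."
  [s]
  (str/split (str/lower-case (str/trim s)) #"\s+"))

(defn ngram-counts
  "Returns a map from each n-gram (a vector of n tokens) to its count."
  [n tokens]
  (frequencies (map vec (partition n 1 tokens))))

(defn bigram-probability
  "Unsmoothed maximum likelihood estimate P(w2 | w1) = C(w1 w2) / C(w1).
  Returns 0 when w1 never occurs in tokens."
  [tokens w1 w2]
  (let [unigrams (ngram-counts 1 tokens)
        bigrams  (ngram-counts 2 tokens)
        c1       (get unigrams [w1] 0)]
    (if (zero? c1)
      0
      (/ (get bigrams [w1 w2] 0) c1))))

;; Toy corpus, made up for this example.
(def tokens (tokenize "I am Sam Sam I am"))
```

With this toy corpus, (bigram-probability tokens "am" "sam") evaluates to the ratio 1/2, since "am" occurs twice and is followed by "sam" once. No smoothing is applied, so unseen bigrams get probability 0.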

Chapter 6

  • Exercise 6.1 implemented in src/nlp/hmm.clj as function forward
  • Exercise 6.2 implemented in src/nlp/hmm.clj as function viterbi
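
As a rough illustration of what the forward algorithm computes, here is a self-contained sketch. It does not reproduce the function in src/nlp/hmm.clj: the map-based HMM representation, the name forward-step, and the probabilities in example-hmm are all assumptions made for this example.

```clojure
;; Hypothetical HMM representation with made-up toy probabilities;
;; not the data structure used in src/nlp/hmm.clj.
(def example-hmm
  {:states [:hot :cold]
   :start  {:hot 0.8 :cold 0.2}
   :trans  {:hot  {:hot 0.7 :cold 0.3}
            :cold {:hot 0.4 :cold 0.6}}
   :emit   {:hot  {1 0.2, 2 0.4, 3 0.4}
            :cold {1 0.5, 2 0.4, 3 0.1}}})

(defn forward-step
  "Given alpha, a map from state to forward probability at time t-1,
  returns the corresponding map at time t after observing obs."
  [{:keys [states trans emit]} alpha obs]
  (into {}
        (for [s states]
          [s (* (get-in emit [s obs] 0.0)
                (reduce + (map (fn [prev]
                                 (* (alpha prev)
                                    (get-in trans [prev s] 0.0)))
                               states)))])))

(defn forward
  "Returns the likelihood of a non-empty observation sequence under the
  HMM, summing over all hidden state paths."
  [{:keys [states start emit] :as hmm} observations]
  (let [init (into {}
                   (for [s states]
                     [s (* (start s)
                           (get-in emit [s (first observations)] 0.0))]))]
    (reduce + (vals (reduce (partial forward-step hmm)
                            init
                            (rest observations))))))
```

With these toy numbers, (forward example-hmm [3 1]) evaluates to approximately 0.1004. Replacing the sum over previous states in forward-step with a max, and recording which state achieved it, gives the core recurrence of the Viterbi algorithm.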

Chapter 13

  • Exercise 13.1 implemented in src/nlp/grammar.clj as function cfg-to-cnf

Usage

I will add some example usage in the near future.

Todo

  • Implement Exercise 3.11 (Minimum Edit Distance)

  • Implement Exercise 4.5 (Good-Turing Discounting)

  • Implement Exercise 4.6 (Katz Backoff)

  • Implement Exercise 4.7 (Compute Perplexity)

  • Implement Exercise 13.2 (CKY algorithm)

  • Implement Test Cases (Learn how first!)

  • Implement Proper Documentation (Learn how first!)

License

Copyright © 2012 Cody Rioux

Distributed under the Eclipse Public License, the same as Clojure.