nlp
A Clojure implementation of the algorithms found in Speech and Language Processing by Daniel Jurafsky and James H. Martin.
This code base has two purposes: to improve my comfort and skill with Clojure, and to deepen my knowledge of the algorithms used in Natural Language Processing and Computational Linguistics.
Implementations
Chapter 2
- Exercise 2.2 (ELIZA) implemented in src/nlp/eljza.clj
Chapter 4
- Exercise 4.2 implemented in src/nlp/ngrams.clj
- Exercise 4.4 implemented in src/nlp/ngrams.clj
Chapter 6
- Exercise 6.1 implemented in src/nlp/hmm.clj as function forward
- Exercise 6.2 implemented in src/nlp/hmm.clj as function viterbi
Chapter 13
- Exercise 13.1 implemented in src/nlp/grammar.clj as function cfg-to-cnf
Usage
Usage examples will be added in the near future.
Todo
- Implement Exercise 3.11 (Minimum Edit Distance)
- Implement Exercise 4.5 (Good-Turing Discounting)
- Implement Exercise 4.6 (Katz Backoff)
- Implement Exercise 4.7 (Compute Perplexity)
- Implement Exercise 13.2 (CKY algorithm)
- Implement Test Cases (Learn how first!)
- Implement Proper Documentation (Learn how first!)
License
Copyright © 2012 Cody Rioux
Distributed under the Eclipse Public License, the same as Clojure.