machine translation, parsing algorithm, coreference resolution system, neural network for named entity recognition

Primary Language: Java

NLP

This repository contains four programming assignments in natural language processing.

The first one builds a modern phrase-based statistical machine translation system. It has two main components: an implementation of the IBM word alignment models and feature engineering for an MT decoder.
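As a rough illustration of the word alignment component, the sketch below runs EM for IBM Model 1 on a toy parallel corpus. It is not the repository's code: the class name `Ibm1Sketch`, the `train` method, and the `"f|e"` key encoding are assumptions made for this example, and the NULL source word is left out for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of IBM Model 1 training; names are illustrative only.
class Ibm1Sketch {
    // Returns translation probabilities t(f|e), keyed as "f|e".
    // Assumes tokens contain no '|' character and omits the NULL word.
    static Map<String, Double> train(String[][] src, String[][] tgt, int iterations) {
        Map<String, Double> t = new HashMap<>();
        // Start with equal weight for every co-occurring word pair.
        for (int s = 0; s < src.length; s++)
            for (String f : src[s])
                for (String e : tgt[s])
                    t.put(f + "|" + e, 1.0);

        for (int it = 0; it < iterations; it++) {
            Map<String, Double> count = new HashMap<>();
            Map<String, Double> total = new HashMap<>();
            // E-step: collect expected alignment counts.
            for (int s = 0; s < src.length; s++) {
                for (String f : src[s]) {
                    double z = 0.0;
                    for (String e : tgt[s]) z += t.get(f + "|" + e);
                    for (String e : tgt[s]) {
                        double p = t.get(f + "|" + e) / z;
                        count.merge(f + "|" + e, p, Double::sum);
                        total.merge(e, p, Double::sum);
                    }
                }
            }
            // M-step: renormalise so that t(.|e) sums to one for each e.
            for (Map.Entry<String, Double> en : count.entrySet()) {
                String key = en.getKey();
                String e = key.substring(key.indexOf('|') + 1);
                t.put(key, en.getValue() / total.get(e));
            }
        }
        return t;
    }
}
```

On a corpus such as "das Haus / the house", "das Buch / the book", "ein Buch / a book", the probability mass for "the" shifts toward "das" as iterations proceed, because "das" is the only word that co-occurs with "the" in every sentence pair containing it.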

The second one implements a parsing algorithm for a broad-coverage statistical treebank parser and tests the algorithm on the WSJ section of the Penn Treebank.
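The README does not name the parsing algorithm. As one plausible example, the sketch below shows probabilistic CKY over a binarized PCFG, a common choice for treebank parsers. The class `CkySketch`, the string encodings of rules (`"A B C"` for `A -> B C`, `"TAG word"` for lexical entries), and the toy grammar are all assumptions for illustration. Unary rules and backpointers are omitted.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of Viterbi CKY; not the repository's parser.
class CkySketch {
    // Returns the best log-probability of `root` spanning the whole sentence,
    // or negative infinity if no parse exists.
    static double bestLogProb(String[] words, Map<String, Double> lexicon,
                              Map<String, Double> binary, String root) {
        int n = words.length;
        // chart key "i,j,A" holds the best log-prob of A over words[i..j).
        Map<String, Double> chart = new HashMap<>();

        // Fill width-1 spans from the lexicon.
        for (int i = 0; i < n; i++) {
            for (Map.Entry<String, Double> r : lexicon.entrySet()) {
                String[] p = r.getKey().split(" "); // tag, word
                if (p[1].equals(words[i]))
                    chart.put(i + "," + (i + 1) + "," + p[0], Math.log(r.getValue()));
            }
        }
        // Combine smaller spans into larger ones, keeping the best score.
        for (int len = 2; len <= n; len++) {
            for (int i = 0; i + len <= n; i++) {
                int j = i + len;
                for (int k = i + 1; k < j; k++) {
                    for (Map.Entry<String, Double> r : binary.entrySet()) {
                        String[] p = r.getKey().split(" "); // parent, left, right
                        Double left = chart.get(i + "," + k + "," + p[1]);
                        Double right = chart.get(k + "," + j + "," + p[2]);
                        if (left == null || right == null) continue;
                        double score = Math.log(r.getValue()) + left + right;
                        chart.merge(i + "," + j + "," + p[0], score, Double::max);
                    }
                }
            }
        }
        return chart.getOrDefault("0," + n + "," + root, Double.NEGATIVE_INFINITY);
    }
}
```

Looping over every rule at every split point keeps the sketch short. A parser meant for the full Penn Treebank grammar would normally index rules by child symbol instead.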

The third one implements two coreference resolution systems: the first system is rule-based while the second one is based on a discriminative statistical classifier.
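The README does not list the rules used by the rule-based system. As a minimal illustration of the idea, the sketch below applies a single rule, case-insensitive exact string match, to group mentions into clusters. The class `ExactMatchCoref` and its `cluster` method are hypothetical names for this example. A real rule-based system would typically add head matching, pronoun agreement, and similar rules.

```java
import java.util.List;

// Hypothetical single-rule baseline; not the repository's system.
class ExactMatchCoref {
    // Returns one cluster id per mention. A mention joins the cluster of the
    // earliest previous mention with the same text (ignoring case);
    // otherwise it starts a new cluster.
    static int[] cluster(List<String> mentions) {
        int[] ids = new int[mentions.size()];
        int next = 0;
        for (int i = 0; i < mentions.size(); i++) {
            ids[i] = -1;
            for (int j = 0; j < i; j++) {
                if (mentions.get(j).equalsIgnoreCase(mentions.get(i))) {
                    ids[i] = ids[j];
                    break;
                }
            }
            if (ids[i] < 0) ids[i] = next++;
        }
        return ids;
    }
}
```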

The fourth one implements a neural network for named entity recognition, including the word embedding layer, the feedforward neural network, and the corresponding backpropagation training algorithm.
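To show how the three pieces fit together, here is a small sketch of a window-based classifier with an embedding lookup, one tanh hidden layer, and a hand-written backpropagation step. The architecture details are assumptions, not taken from the repository: the class `WindowNerSketch`, the binary entity/non-entity label, the sigmoid output, and all hyperparameters are chosen only to keep the example short.

```java
import java.util.Random;

// Hypothetical sketch: embeddings -> tanh hidden layer -> sigmoid output.
class WindowNerSketch {
    final double[][] L;  // word embeddings, vocab x dim
    final double[][] W;  // hidden weights, hidden x (window * dim)
    final double[] b1;   // hidden bias
    final double[] U;    // output weights
    double b2;           // output bias
    final double lr;     // learning rate

    WindowNerSketch(int vocab, int dim, int window, int hidden, double lr, long seed) {
        Random rng = new Random(seed);
        L = new double[vocab][dim];
        W = new double[hidden][window * dim];
        b1 = new double[hidden];
        U = new double[hidden];
        this.lr = lr;
        for (double[] row : L) for (int i = 0; i < row.length; i++) row[i] = 0.1 * rng.nextGaussian();
        for (double[] row : W) for (int i = 0; i < row.length; i++) row[i] = 0.1 * rng.nextGaussian();
        for (int i = 0; i < hidden; i++) U[i] = 0.1 * rng.nextGaussian();
    }

    // One SGD step on a window of word ids with label y (1 = entity, 0 = not).
    // Returns the cross-entropy loss computed before the update.
    double step(int[] window, int y) {
        int dim = L[0].length, hidden = U.length;
        // Embedding layer: concatenate the vectors of the window's words.
        double[] x = new double[window.length * dim];
        for (int w = 0; w < window.length; w++)
            System.arraycopy(L[window[w]], 0, x, w * dim, dim);
        // Forward pass.
        double[] h = new double[hidden];
        double z = b2;
        for (int j = 0; j < hidden; j++) {
            double a = b1[j];
            for (int i = 0; i < x.length; i++) a += W[j][i] * x[i];
            h[j] = Math.tanh(a);
            z += U[j] * h[j];
        }
        double p = 1.0 / (1.0 + Math.exp(-z));
        double loss = -(y == 1 ? Math.log(p) : Math.log(1.0 - p));
        // Backward pass: each gradient is computed from the old weights
        // before those weights are updated.
        double dz = p - y;
        double[] dx = new double[x.length];
        for (int j = 0; j < hidden; j++) {
            double da = dz * U[j] * (1.0 - h[j] * h[j]);
            U[j] -= lr * dz * h[j];
            for (int i = 0; i < x.length; i++) {
                dx[i] += da * W[j][i];
                W[j][i] -= lr * da * x[i];
            }
            b1[j] -= lr * da;
        }
        b2 -= lr * dz;
        // Push the input gradient back into the embedding rows.
        for (int w = 0; w < window.length; w++)
            for (int d = 0; d < dim; d++)
                L[window[w]][d] -= lr * dx[w * dim + d];
        return loss;
    }
}
```

Calling `step` repeatedly on the same window and label should drive the returned loss down, which is a quick sanity check for the gradient code. A multi-class NER tagger would replace the sigmoid with a softmax over entity types.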