UCSD-CSE256-Statistical-NLP

Natural language processing (NLP) is a field of AI that aims to equip computers with the ability to process natural (human) language intelligently. This course explores statistical techniques for the automatic analysis of natural language data. Topics covered include probabilistic language models, which define probability distributions over text sequences; text classification; sequence models; parsing sentences into syntactic representations; machine translation; and machine reading.
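
To make "a probability distribution over text sequences" concrete, here is a minimal sketch of a bigram language model with add-one (Laplace) smoothing in Python. It is not the course's or the assignments' implementation: the assignments use trigram models, and a bigram model is swapped in here only to keep the example short. The names `train_bigram`, `bigram_prob`, `sentence_logprob`, the `<s>`/`</s>` boundary markers, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical sentence-boundary markers (illustrative, not from the course).
BOS, EOS = "<s>", "</s>"


def train_bigram(sentences):
    """Collect context and bigram counts from a list of tokenized sentences."""
    context_counts, bigram_counts = Counter(), Counter()
    vocab = {EOS}  # EOS can be predicted; BOS only ever appears as a context
    for tokens in sentences:
        padded = [BOS] + tokens + [EOS]
        vocab.update(tokens)
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def bigram_prob(prev, cur, model):
    """Add-one smoothed P(cur | prev). Assumes cur is in the vocabulary;
    out-of-vocabulary tokens are not handled in this sketch."""
    context_counts, bigram_counts, vocab = model
    return (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + len(vocab))


def sentence_logprob(tokens, model):
    """Log-probability of a whole sentence, including the end-of-sentence event."""
    padded = [BOS] + tokens + [EOS]
    return sum(math.log(bigram_prob(p, c, model)) for p, c in zip(padded, padded[1:]))


# Toy corpus, invented for this example.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
model = train_bigram(corpus)

# For any context, the smoothed next-token probabilities sum to 1.
total = sum(bigram_prob("the", w, model) for w in model[2])

# A word order seen in training scores higher than a scrambled one.
seen = sentence_logprob(["the", "cat", "sat"], model)
scrambled = sentence_logprob(["sat", "cat", "the"], model)
```

Because every conditional distribution sums to 1 and each sentence ends with an explicit `</s>` event, the model assigns a proper probability to each finite token sequence. Extending the context from one previous token to two gives the trigram setting named in the assignment list.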

| Assignments | Topics | Requirement | Project Report | Score |
| --- | --- | --- | --- | --- |
| 1. Comparing Language Models | Language Model + Trigram + Smoothing | Requirement | Report | 100/100; 15% |
| 2. Semi-supervised Text Classification | Feature Engineering + Semi-supervised learning | Requirement | Report | 100/100; 20% |
| 3. Sequence Tagging | Trigram HMM | Requirement | Report | 100/100; 10% |
| 4. Machine Translation | IBM Model 1&2 | Requirement | Report | 104/104; 15% |