
Named Entity Recognition for Machine Learning course

Primary Language: Python

NLP_NER

Description

In this design project, we design a sequence labelling model for informal texts using the hidden Markov model (HMM) covered in class. The sequence labelling system is intended as a first step towards building a more complex, intelligent sentiment analysis system for social media text. Specifically, we focus on building two NLP systems for Tweets: a sentiment analysis system and a phrase chunking system.

The data for this project is in EN.zip, FR.zip, SG.zip, and CN.zip (the latter two will be available on 9 Nov 2018, after part 1 has been completed). For each dataset, we provide a labelled training set train, an unlabelled development set dev.in, and a labelled development set dev.out. Labelled data has one token per line, with the token and tag separated by a tab, and a single empty line separates sentences.
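The labelled file format described above can be parsed with a few lines of Python. This is a minimal sketch, not the project's actual loader: the function name `read_labelled` and the tags in `sample` are made up for illustration.

```python
def read_labelled(lines):
    """Parse 'token<TAB>tag' lines into a list of sentences.

    Each sentence is a list of (token, tag) pairs; an empty line ends a sentence.
    """
    sentences, current = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # Blank line: close the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        # Split on the last tab so a token containing a tab stays intact.
        token, tag = line.rsplit("\t", 1)
        current.append((token, tag))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample; the tag names are placeholders, not the project's tag set.
sample = ["I\tO\n", "love\tO\n", "\n", "Paris\tB-NP\n"]
parsed = read_labelled(sample)
```

For a real file, something like `read_labelled(open("EN/train", encoding="utf-8"))` should work, assuming the archive unpacks to that path.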

Part 2

Implement the HMM and estimate its parameters: the transition probabilities a and the emission probabilities b.
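As an illustration of the estimation step, the sketch below computes maximum-likelihood estimates of a and b by counting. It assumes `START` and `STOP` pseudo-tags and applies no smoothing for unseen words; the function name `estimate_hmm` is hypothetical and the repository's own code may differ.

```python
from collections import Counter

START, STOP = "START", "STOP"  # assumed pseudo-tags for sentence boundaries


def estimate_hmm(sentences):
    """Return (a, b) as dicts of relative-frequency estimates.

    a[(u, v)] = count(u -> v) / count(u)     transition probability
    b[(v, x)] = count(v emits x) / count(v)  emission probability
    """
    tag_count = Counter()
    trans_count = Counter()
    emit_count = Counter()
    for sent in sentences:
        prev = START
        tag_count[START] += 1
        for token, tag in sent:
            trans_count[(prev, tag)] += 1
            emit_count[(tag, token)] += 1
            tag_count[tag] += 1
            prev = tag
        trans_count[(prev, STOP)] += 1
    a = {(u, v): c / tag_count[u] for (u, v), c in trans_count.items()}
    b = {(v, x): c / tag_count[v] for (v, x), c in emit_count.items()}
    return a, b


# Toy data with placeholder tags X and Y.
toy = [[("a", "X"), ("b", "Y")], [("a", "X")]]
a, b = estimate_hmm(toy)
```

Because nothing is smoothed, any token absent from the training set gets emission probability zero under every tag, so a real system needs some treatment of unknown words.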

Part 3

Implement the Viterbi algorithm.
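A minimal sketch of first-order Viterbi decoding in log space follows. It assumes a and b are dicts keyed by `(previous_tag, tag)` and `(tag, token)`, with `START` and `STOP` pseudo-tags; these conventions and the toy parameters are assumptions for illustration, not taken from the repository.

```python
import math


def viterbi(tokens, tags, a, b, start="START", stop="STOP"):
    """Return the most probable tag sequence for tokens under the HMM (a, b)."""
    if not tokens:
        return []

    def logp(table, key):
        p = table.get(key, 0.0)
        return math.log(p) if p > 0 else float("-inf")

    # score[i][v]: best log-probability of any tag path ending in v at position i.
    score = [{} for _ in tokens]
    back = [{} for _ in tokens]
    for v in tags:
        score[0][v] = logp(a, (start, v)) + logp(b, (v, tokens[0]))
    for i in range(1, len(tokens)):
        for v in tags:
            best_u = max(tags, key=lambda u: score[i - 1][u] + logp(a, (u, v)))
            score[i][v] = (score[i - 1][best_u] + logp(a, (best_u, v))
                           + logp(b, (v, tokens[i])))
            back[i][v] = best_u
    # Include the transition into STOP when choosing the final tag.
    last = max(tags, key=lambda v: score[-1][v] + logp(a, (v, stop)))
    path = [last]
    for i in range(len(tokens) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy parameters with placeholder tags X and Y.
toy_a = {("START", "X"): 1.0, ("X", "Y"): 0.5, ("X", "STOP"): 0.5, ("Y", "STOP"): 1.0}
toy_b = {("X", "a"): 1.0, ("Y", "b"): 1.0}
best = viterbi(["a", "b"], ["X", "Y"], toy_a, toy_b)
```

Working with log-probabilities avoids numerical underflow on long sentences; the running time is O(n·T²) for n tokens and T tags.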

Part 4

Implement HMM with second-order dependencies.
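One common reading of "second-order dependencies" is that each tag is conditioned on the two preceding tags. Under that assumption, the transition estimates can be sketched as trigram counts normalised by bigram counts. The function name and the double `START` padding are illustrative choices, not necessarily what this project does.

```python
from collections import Counter


def estimate_second_order_transitions(tag_sequences, start="START", stop="STOP"):
    """Return a2[(t, u, v)] = count(t, u, v) / count(t, u as a trigram prefix)."""
    tri = Counter()
    bi = Counter()
    for tags in tag_sequences:
        # Pad with two START symbols so the first real tag has a full history.
        padded = [start, start] + list(tags) + [stop]
        for t, u, v in zip(padded, padded[1:], padded[2:]):
            tri[(t, u, v)] += 1
            bi[(t, u)] += 1
    return {(t, u, v): c / bi[(t, u)] for (t, u, v), c in tri.items()}


# Toy tag sequences with placeholder tags X and Y.
a2 = estimate_second_order_transitions([["X", "Y"], ["X"]])
```

Decoding with these parameters needs a Viterbi variant whose states are tag pairs, which raises the cost to O(n·T³).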

Part 5 (Design Challenge)

Implement a structured perceptron. More information is available in ML_Report.pdf.
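For readers unfamiliar with the technique, here is a generic structured perceptron sketch; it is not the design described in ML_Report.pdf. The feature templates (emission and transition indicators), the function names, and the exhaustive decoder are all assumptions made to keep the example short. A practical version would decode with Viterbi and usually average the weights.

```python
from collections import defaultdict
from itertools import product


def features(tokens, tags):
    """Count emission and transition indicator features for one tagged sentence."""
    feats = defaultdict(int)
    prev = "START"
    for token, tag in zip(tokens, tags):
        feats[("emit", tag, token)] += 1
        feats[("trans", prev, tag)] += 1
        prev = tag
    feats[("trans", prev, "STOP")] += 1
    return feats


def score(weights, feats):
    return sum(weights.get(f, 0.0) * c for f, c in feats.items())


def decode(weights, tokens, tagset):
    # Exhaustive search over all tag sequences: fine for toy inputs only.
    return max(product(tagset, repeat=len(tokens)),
               key=lambda tags: score(weights, features(tokens, tags)))


def train(sentences, tagset, epochs=5):
    weights = defaultdict(float)
    for _ in range(epochs):
        for sent in sentences:
            tokens = [tok for tok, _ in sent]
            gold = tuple(tag for _, tag in sent)
            pred = decode(weights, tokens, tagset)
            if pred != gold:
                # Perceptron update: reward gold features, penalise predicted ones.
                for f, c in features(tokens, gold).items():
                    weights[f] += c
                for f, c in features(tokens, pred).items():
                    weights[f] -= c
    return weights


# Toy data with placeholder tags X and Y.
toy = [[("a", "X"), ("b", "Y")], [("b", "Y"), ("a", "X")]]
tagset = ("X", "Y")
w = train(toy, tagset)
```

Unlike the HMM, the weights are not probabilities, so arbitrary overlapping features (suffixes, capitalisation, hashtags) can be added to `features` without changing the training loop.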