NLP-Homeworks

This repository contains homework assignments given by Professor Jan Hajic at Charles University for the courses Statistical Methods for Natural Language Processing I and II (NPFL067, NPFL068).


Here you can find some of the homework I did for the courses Statistical Methods in Natural Language Processing I and II, taught by Professor Jan Hajic.
I have always found NLP a fascinating topic, and I am glad I found a way to challenge myself by doing some work in this field.

Here you can find the syllabi of the aforementioned courses:

The assignments

  • Assignment 1: Study the perplexity and entropy of a text.
  • Assignment 2: Use the EM algorithm to estimate the parameters that tune the probabilities obtained from the training data on heldout data. Finally, evaluate the model by computing the cross-entropy on a separate test set.
  • Assignment 3: Train your own Brill tagger by defining the templates and choosing an appropriate maximum number of rules.
  • Assignment 4: Train an HMM tagger in both a supervised and an unsupervised way (Viterbi training and Baum-Welch training, respectively).
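The core computation of Assignment 1 can be sketched as below. This is only an illustrative toy (the function names and the sample sentence are mine, not the assignment's), using the empirical unigram distribution of a token stream:

```python
import math
from collections import Counter

def unigram_entropy(tokens):
    """Per-word entropy H = -sum_w p(w) * log2 p(w) of the empirical
    unigram distribution estimated from the token stream itself."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def perplexity(entropy):
    """Perplexity is 2 raised to the per-word entropy."""
    return 2 ** entropy

# Toy example: "the" occurs twice, the other four words once each.
text = "the cat sat on the mat".split()
h = unigram_entropy(text)
print(h, perplexity(h))  # roughly 2.2516 bits, perplexity about 4.76
```

The actual assignment works over much longer character and word streams, but the entropy/perplexity relationship is the same.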
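The EM step of Assignment 2 can be sketched in a deliberately simplified form: here the smoothed model interpolates only a unigram distribution with a uniform one (the assignment itself uses higher-order n-grams), and all function names are my own:

```python
import math
from collections import Counter

def train_unigram(tokens):
    """Maximum-likelihood unigram probabilities from training data."""
    counts = Counter(tokens)
    total = len(tokens)
    return {w: c / total for w, c in counts.items()}

def em_interpolation(heldout, p_uni, vocab_size, iters=50):
    """EM for the weights of P'(w) = l1*p_uni(w) + l0/V,
    maximizing the likelihood of the heldout data."""
    l1 = l0 = 0.5
    for _ in range(iters):
        c1 = c0 = 0.0
        for w in heldout:
            p1 = l1 * p_uni.get(w, 0.0)  # unigram component's share
            p0 = l0 / vocab_size         # uniform component's share
            total = p1 + p0
            c1 += p1 / total             # expected count for the unigram component
            c0 += p0 / total             # expected count for the uniform component
        l1, l0 = c1 / len(heldout), c0 / len(heldout)
    return l1, l0

def cross_entropy(test, p_uni, l1, l0, vocab_size):
    """Per-word cross-entropy (in bits) of the smoothed model on test data."""
    return -sum(math.log2(l1 * p_uni.get(w, 0.0) + l0 / vocab_size)
                for w in test) / len(test)
```

Note that the uniform component is what keeps the test cross-entropy finite when the test data contains words unseen in training.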
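Decoding with the trained HMM tagger of Assignment 4 relies on the Viterbi algorithm, which can be sketched as below. The toy states, words, and probability tables are purely illustrative, not from the assignment data:

```python
import math

def logp(p):
    """Log-probability, mapping 0 to -inf so impossible paths lose every max."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely state (tag) sequence for an observation (word) sequence,
    computed in log space to avoid underflow."""
    # best[t][s]: log-prob of the best path ending in state s after t+1 words
    best = [{s: logp(start_p.get(s, 0.0)) + logp(emit_p[s].get(obs[0], 0.0))
             for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + logp(trans_p[r].get(s, 0.0)))
                 for r in states),
                key=lambda x: x[1])
            best[t][s] = score + logp(emit_p[s].get(obs[t], 0.0))
            back[t][s] = prev
    # Backtrace from the highest-scoring final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

# Toy tagging example with hand-set probabilities.
states = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit_p = {"N": {"time": 0.6, "flies": 0.4}, "V": {"time": 0.1, "flies": 0.9}}
print(viterbi(["time", "flies"], states, start_p, trans_p, emit_p))  # ['N', 'V']
```

In the supervised setting the tables come from relative frequencies of the tagged training data; Baum-Welch instead re-estimates them from unlabeled text.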