Natural Language Processing - Project [LM2]

author: Davide Lusuardi 223821

Language Modeling

Implementation of a Recurrent Neural Network Language Model along with a simple version of LSTM and GRU cells.
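
To give an idea of what a simple LSTM cell can look like in PyTorch, here is a minimal sketch. It is not taken from the notebook: the class name SimpleLSTMCell, the single fused gate layer, and the tensor sizes are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class SimpleLSTMCell(nn.Module):
    """Hypothetical minimal LSTM cell (illustrative, not the notebook's code)."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.hidden_size = hidden_size
        # One linear layer computes all four gate pre-activations
        # from the concatenation [x_t, h_prev].
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)

    def forward(self, x, state):
        h_prev, c_prev = state
        z = self.gates(torch.cat([x, h_prev], dim=1))
        # Split into input gate, forget gate, candidate, output gate.
        i, f, g, o = z.chunk(4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        c = f * c_prev + i * g      # new cell state
        h = o * torch.tanh(c)       # new hidden state
        return h, c


# Usage: one step on a batch of 4 vectors of size 8 (sizes are arbitrary).
torch.manual_seed(0)
cell = SimpleLSTMCell(input_size=8, hidden_size=16)
x = torch.randn(4, 8)
h0 = torch.zeros(4, 16)
c0 = torch.zeros(4, 16)
h1, c1 = cell(x, (h0, c0))
```

In a language model, a cell like this would be applied once per token of the input sequence, with the hidden state fed to an output layer over the vocabulary. A GRU cell follows the same pattern with two gates and no separate cell state.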

The model has been trained on the Penn Treebank (PTB) corpus available here.

Requirements

The following Python libraries are required:

  • PyTorch (installed as the torch package)
  • pandas

Notebook

The source code file is a Jupyter notebook and can be opened with Jupyter or Google Colab.

Documentation and a description of the code are included in the notebook.

The data folder contains a subfolder, ptbdataset, which holds the training, validation, and test sets of the Penn Treebank.
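
As a rough illustration of how such text files might be turned into word ids for a language model, here is a small sketch using only the standard library. The file name ptb.train.txt, the one-sentence-per-line layout, and the <eos> token are assumptions, not details confirmed by this repository; the helper names read_sentences and build_vocab are hypothetical.

```python
from collections import Counter
from pathlib import Path
import tempfile


def read_sentences(path, eos="<eos>"):
    """Read one sentence per line, split on whitespace, append an end marker."""
    sentences = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            words = line.split()
            if words:  # skip empty lines
                sentences.append(words + [eos])
    return sentences


def build_vocab(sentences):
    """Map each word to an integer id, most frequent first (ties alphabetical)."""
    counts = Counter(w for s in sentences for w in s)
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {w: i for i, w in enumerate(ordered)}


# Tiny stand-in file so the sketch runs without the real corpus.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "ptb.train.txt"  # assumed file name
    demo.write_text("the cat sat\nthe dog sat\n", encoding="utf-8")
    train = read_sentences(demo)
    vocab = build_vocab(train)
    ids = [[vocab[w] for w in s] for s in train]
```

To try this on the real data, point read_sentences at the actual files inside data/ptbdataset and build the vocabulary from the training set only, so that the validation and test sets are encoded with the same ids.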