This repository is used for a language modeling Pareto competition at TTIC. I implemented an attention layer on top of the RNN model. TODO: Lei Mao suggests an alternative implementation that integrates the attention mechanism directly into the LSTM class.
This codebase requires Python 3 and PyTorch.
python main.py --att --att_width 20 # Train an LSTM on PTB with an attention layer, setting the attention width to 20
python generate.py # Generate samples from the trained LSTM model.
The code was originally forked from the PyTorch word-level language modeling RNN example and modified to add an attention layer to the model.
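For orientation, here is a minimal sketch of what a windowed attention layer over an RNN's hidden states can look like. It assumes dot-product attention over the previous `att_width` hidden states at each time step; the class and parameter names below are illustrative and are not the exact ones used in `model.py`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WindowAttention(nn.Module):
    """Illustrative sketch: attend over the last `att_width` LSTM outputs."""

    def __init__(self, hidden_size, att_width=20):
        super().__init__()
        self.att_width = att_width
        self.score = nn.Linear(hidden_size, hidden_size, bias=False)
        self.combine = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, outputs):
        # outputs: (seq_len, batch, hidden_size), e.g. the LSTM output sequence
        seq_len = outputs.size(0)
        contexts = []
        for t in range(seq_len):
            lo = max(0, t - self.att_width)
            window = outputs[lo:t + 1]                   # (w, batch, hidden)
            query = self.score(outputs[t]).unsqueeze(0)  # (1, batch, hidden)
            scores = (window * query).sum(dim=-1)        # (w, batch)
            weights = F.softmax(scores, dim=0).unsqueeze(-1)
            context = (weights * window).sum(dim=0)      # (batch, hidden)
            # Mix the attention context with the current hidden state
            contexts.append(torch.tanh(
                self.combine(torch.cat([context, outputs[t]], dim=-1))))
        return torch.stack(contexts)                     # (seq_len, batch, hidden)
```

The resulting sequence can feed the decoder (the final linear layer over the vocabulary) in place of the raw LSTM outputs; `--att_width` would then control how far back the window reaches.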