nmt_tutorial

A tutorial on Neural Machine Translation using the Encoder-Decoder architecture with attention.


This repository is a collection of items I created towards the completion of my senior thesis in the Applied Mathematics Dept. at Loyola Marymount University under the supervision of Dr. Thomas Laurent. It also contains items from a tutorial I created on the topic, stemming from what I learned while completing the thesis.

The thesis itself revolved around understanding and thoroughly explaining Neural Machine Translation, with the end goal of creating my own model in Python. The model I created was largely inspired by, and drew from, the PyTorch tutorial "Translation with a Sequence to Sequence Network and Attention". However, this model was enhanced in a number of ways. Most notably, the code allows the data to be processed in mini-batches, adds an option to easily split the dataset into train and test data, and implements a number of other useful enhancements (optional learning rate schedules, the ability to handle datasets in different formats, the ability to handle unknown words, etc.).
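For illustration, the sketch below shows how two of those enhancements can look in PyTorch: mapping out-of-vocabulary words to an `<unk>` token and padding sentences so a mini-batch can be fed to the encoder as one tensor. This is not the repository's actual code; all names (`Vocab`, `pad_batch`, the reserved indices) are hypothetical.

```python
import torch

PAD, UNK = 0, 1  # reserved indices for padding and unknown words

class Vocab:
    def __init__(self, sentences, min_count=1):
        counts = {}
        for sent in sentences:
            for word in sent.split():
                counts[word] = counts.get(word, 0) + 1
        self.word2idx = {"<pad>": PAD, "<unk>": UNK}
        for word, c in counts.items():
            if c >= min_count:
                self.word2idx[word] = len(self.word2idx)

    def encode(self, sentence):
        # Unknown words fall back to the <unk> index instead of raising an error.
        return [self.word2idx.get(w, UNK) for w in sentence.split()]

def pad_batch(encoded_sentences):
    # Pad every sentence to the length of the longest one in the batch
    # so the batch becomes a single (batch_size, max_len) tensor.
    max_len = max(len(s) for s in encoded_sentences)
    padded = [s + [PAD] * (max_len - len(s)) for s in encoded_sentences]
    return torch.tensor(padded)

if __name__ == "__main__":
    train = ["je suis etudiant", "il a froid"]
    vocab = Vocab(train)
    batch = pad_batch([vocab.encode(s) for s in train + ["je suis medecin"]])
    print(batch)  # "medecin" was never seen, so it maps to the <unk> index
```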

Along with enhancing the code in the PyTorch tutorial to create my own model, I completed a thesis paper which details the math behind the Encoder-Decoder neural network.
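For reference, the standard attention-based Encoder-Decoder formulation (Bahdanau-style) can be sketched as follows; the thesis paper's own notation and derivation may differ.

```latex
% Encoder hidden states, attention weights, context vector, and decoder update.
\begin{align}
  h_j &= \mathrm{EncoderRNN}(x_j, h_{j-1})
      && \text{encoder hidden state for source word } x_j \\
  e_{ij} &= a(s_{i-1}, h_j)
      && \text{alignment score between decoder step } i \text{ and } h_j \\
  \alpha_{ij} &= \frac{\exp(e_{ij})}{\sum_{k} \exp(e_{ik})}
      && \text{attention weights (softmax over source positions)} \\
  c_i &= \sum_{j} \alpha_{ij}\, h_j
      && \text{context vector for decoder step } i \\
  s_i &= \mathrm{DecoderRNN}(y_{i-1}, s_{i-1}, c_i)
      && \text{decoder hidden state} \\
  p(y_i \mid y_{<i}, x) &= \mathrm{softmax}\!\left(W_o\,[s_i; c_i] + b_o\right)
\end{align}
```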

Finally, I used the knowledge I gained from this experience to create a tutorial on NMT in Google Colaboratory, which offers free access to a GPU for anyone with a Google account. I also wrote a Medium article, about a 20-minute read, that gives a thorough explanation of the Encoder-Decoder structure and walks through the various parts of the Google Colab code. You can find the Medium article here.