
State-of-the-art overview of seq2seq models (step by step)


Rosetta

I found it hard to reproduce state-of-the-art seq2seq models. Many papers come without code, and code examples from the web are often outdated (written for older versions of TensorFlow, Keras, Python, ...), incomplete or not runnable standalone (missing important details), bloated (with lots of unrelated extras), or focused on a single technique without combining it with others (for example, an attention model without beam search). This project is my attempt to learn and apply state-of-the-art techniques step by step.

Roadmap

  • for toy problems, machine translation, summarization and chatbots
  • in Keras first, then TensorFlow, maybe PyTorch / TensorFlow Hub / tf.keras
  • from a simple model built from the ground up, adding higher-level approaches one at a time (byte pair encodings, beam search, attention, ...)
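To make the byte pair encoding item in the roadmap concrete, here is a minimal sketch of how BPE merge rules can be learned and applied. It is the generic textbook procedure in plain Python, not the code used in the notebooks; the function names `learn_bpe`, `apply_bpe` and `merge_pair` and the `</w>` end-of-word marker are assumptions made for this illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from an iterable of words (hypothetical helper)."""
    # Start from single characters plus an end-of-word marker.
    vocab = Counter(tuple(word) + ("</w>",) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself to stay deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def apply_bpe(word, merges):
    """Segment a single word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    rules = learn_bpe(["abab", "abab", "ab"], num_merges=2)
    print(rules)
    print(apply_bpe("abab", rules))
```

Frequent character sequences end up as single tokens, while rare words fall back to smaller pieces, so the vocabulary stays small without needing an out-of-vocabulary token.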

I don't explain much here; I concentrate on implementation details. There are plenty of better tutorials elsewhere for understanding seq2seq models and their terminology.

Models step by step:

  1. Simple model for adding and subtracting numbers, end-to-end on characters
  2. Simple char-level end-to-end model for machine translation
  3. Byte pair encoding embeddings instead of char-level input for machine translation
  4. Implementing a beam search model
  5. Beam search model trained on a larger dataset
  6. Attention model with TensorFlow trained on a larger dataset
  7. Attention model trained on the full en-de Europarl (European Parliament) dataset
  8. Multi-layer attention model on the full en-de dataset
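The beam search steps in the list above can be illustrated with a small, framework-free sketch. It is not taken from the notebooks: `beam_search`, the `step_fn` callback (which maps a prefix to next-token probabilities) and the hand-written `toy_step` model are assumptions for illustration. In a real model, `step_fn` would wrap one decoder step.

```python
import math


def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=10):
    """Return hypotheses as (log_prob, tokens), best first.

    `step_fn(prefix)` is assumed to return a dict {token: probability}
    for the token following `prefix`.
    """
    beams = [(0.0, [start_token])]
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for score, seq in beams:
            for token, prob in step_fn(seq).items():
                if prob > 0.0:
                    candidates.append((score + math.log(prob), seq + [token]))
        if not candidates:
            break
        # Keep only the `beam_width` best candidates.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_width]:
            if seq[-1] == end_token:
                finished.append((score, seq))
            else:
                beams.append((score, seq))
        if not beams:
            break
    # Hypotheses that reached max_len without an end token are kept as well.
    finished.extend(beams)
    return sorted(finished, key=lambda c: c[0], reverse=True)


def toy_step(prefix):
    """Hand-written toy 'model': next-token probabilities depend only on the last token."""
    table = {
        "<s>": {"a": 0.6, "b": 0.4},
        "a": {"x": 0.55, "y": 0.45},
        "b": {"z": 0.9, "y": 0.1},
    }
    return table.get(prefix[-1], {"</s>": 1.0})


if __name__ == "__main__":
    for width in (1, 2):
        score, tokens = beam_search(toy_step, "<s>", "</s>", beam_width=width)[0]
        print(width, tokens, round(math.exp(score), 2))
```

With `beam_width=1` this degenerates to greedy decoding, which commits to `a` and ends up with a less likely sequence than the one found with a beam of two. That is the reason to combine beam search with the other models rather than always taking the argmax at each step.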

Model weights:

I saved the weights for most models in a Google Drive folder. The file URLs are also included in the notebooks as comments.

Usage / Installation

I'm using Python 3.6 with TensorFlow 1.8.0 and Keras 2.2.0. For details, see the Pipfile.lock.

I use pipenv to track all dependencies and create a virtualenv. Follow the instructions to install pipenv and then run

git clone git@github.com:hanfried/rosetta.git
cd rosetta

pipenv install --ignore-pipfile  # I haven't frozen the requirements in Pipfile, so this uses the exact versions from Pipfile.lock
pipenv run jupyter notebook

to start a Jupyter notebook environment with all required modules installed, running in a virtualenv.

See also