Language-translation-with-transformer-model

An English-to-French translator using a Transformer model built with TensorFlow.

Author: Alvaro Henriquez

Introduction

I undertook this project in order to better understand how Transformer models work. I became interested in the technology when I set out to build a language translation model using an RNN. However, while learning about RNNs and LSTMs I came across Transformers, and as I learned more I realized that I needed to shift my project to this architecture.

Since the paper Attention Is All You Need was published in 2017, introducing the Transformer, it and its many variants have become the models of choice for Natural Language Processing (NLP). They are used to solve many types of sequence-to-sequence problems, including language translation, information retrieval, text classification, document summarization, image captioning, and genome analysis. More recently they have shown strong results in image recognition and object detection.
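The core operation introduced in that paper is scaled dot-product attention: each query vector is compared against all key vectors, the scores are softmax-normalized, and the output is the resulting weighted average of the value vectors. The following is a minimal, dependency-free sketch of that computation (plain Python lists stand in for the tensors a real TensorFlow implementation would use; the function names here are illustrative, not part of this repo's code):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V.

    queries, keys, values: lists of equal-length float vectors
    (one vector per token). Returns one output vector per query.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Dot-product similarity of this query with every key,
        # scaled by sqrt(d_k) to keep softmax gradients stable.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs
```

A real Transformer applies this per attention head, with learned projection matrices producing Q, K, and V from the token embeddings; the sketch above only shows the attention arithmetic itself.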

The data

The dataset comes from the European Parliament Proceedings Parallel Corpus 1996-2011, found at the Statistical Machine Translation website. Specifically, we use the French-English parallel corpus.

Requirements

This project was created using Google Colab. All of the required libraries are included in the Colab environment. You will need a Google account in order to use Colab.

Resources

Attention Is All You Need - The paper that started it all

The Illustrated Transformer - Blog by Jay Alammar

Transformer model for language understanding - TensorFlow tutorial

What Are Transformer Models in Machine Learning? - Blog by Rahul Agarwal