English to Japanese Translator by PyTorch 🙊 (Transformer from scratch)

Overview

  • English to Japanese translator written in PyTorch.
  • The neural network architecture is the Transformer.
  • The Transformer layers are implemented from scratch in PyTorch (you can find them under layers/transformer/).
  • The parallel corpus (dataset) is KFTT.

Transformer

(Figure: the Transformer encoder-decoder architecture from "Attention Is All You Need".)

  • The Transformer is a neural network model proposed in the paper "Attention Is All You Need".

  • As the paper's title suggests, the Transformer is built on the attention mechanism; unlike RNNs and LSTMs, it uses no recurrent computation during training (see the sketch after this list).

  • Many of the models that have achieved high accuracy on various NLP tasks in recent years, such as BERT, GPT-3, and XLNet, are built on the Transformer architecture.
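
The heart of the model is scaled dot-product attention. The repository implements it in layers/transformer/ScaledDotProductAttention.py; the snippet below is an illustrative sketch of the same computation, not the repository's exact code.

import math
from typing import Optional

import torch
import torch.nn.functional as F


def scaled_dot_product_attention(
    q: torch.Tensor,                      # (batch, len_q, d_k)
    k: torch.Tensor,                      # (batch, len_k, d_k)
    v: torch.Tensor,                      # (batch, len_k, d_v)
    mask: Optional[torch.Tensor] = None,  # 0 where attention is forbidden
) -> torch.Tensor:
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(q.size(-1))
    if mask is not None:
        # Padding / future positions get -inf, i.e. zero weight after softmax.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, v)

Multi-head attention runs this routine in parallel over several learned projections of Q, K, and V, which is what layers/transformer/MultiHeadAttention.py adds on top.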

Requirements

  • Python (see pyproject.toml for the exact version constraint)
  • Poetry (dependency management; used in the Setup steps below)

Setup

Install the dependencies and create a virtual environment in the project by running:

$ poetry install

Set PYTHONPATH to the project root so that the top-level packages (layers, models, utils) can be imported:

export PYTHONPATH="$(pwd)"

Download and unzip the parallel corpus (KFTT) by running:

$ poetry run python ./utils/download.py

Directories

The directory structure is as below.

.
├── const
│   └── path.py
├── corpus
│   └── kftt-data-1.0
├── figure
├── layers
│   └── transformer
│       ├── Embedding.py
│       ├── FFN.py
│       ├── MultiHeadAttention.py
│       ├── PositionalEncoding.py
│       ├── ScaledDotProductAttention.py
│       ├── TransformerDecoder.py
│       └── TransformerEncoder.py
├── models
│   ├── Transformer.py
│   └── __init__.py
├── mypy.ini
├── pickles
│   └── nn/
├── poetry.lock
├── poetry.toml
├── pyproject.toml
├── tests
│   ├── conftest.py
│   ├── layers/
│   ├── models/
│   └── utils/
├── train.py
└── utils
    ├── dataset/
    ├── download.py
    ├── evaluation/
    └── text/
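
Because PYTHONPATH points at the project root (see Setup), these directories double as import paths. A hypothetical example; the class names are assumed to mirror the file names above:

from layers.transformer.MultiHeadAttention import MultiHeadAttention  # assumed class name
from models.Transformer import Transformer                            # assumed class name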

How to run

You can train the model by running:

$ poetry run python train.py

epoch: 1
--------------------Train--------------------
train loss: 10.104473114013672, bleu score: 0.0,iter: 1/4403
train loss: 9.551202774047852, bleu score: 0.0,iter: 2/4403
train loss: 8.950608253479004, bleu score: 0.0,iter: 3/4403
train loss: 8.688143730163574, bleu score: 0.0,iter: 4/4403
train loss: 8.4220552444458, bleu score: 0.0,iter: 5/4403
train loss: 8.243291854858398, bleu score: 0.0,iter: 6/4403
train loss: 8.187620162963867, bleu score: 0.0,iter: 7/4403
train loss: 7.6360859870910645, bleu score: 0.0,iter: 8/4403
....
  • For each epoch, the model at that point is saved under pickles/nn/.
  • When training is finished, loss.png is saved under figure/.
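
To translate with a saved snapshot, you can load it back with torch.load. A minimal sketch, assuming train.py pickles the whole model object (the file name below is hypothetical; check pickles/nn/ for the actual names):

import torch

# Hypothetical file name; actual names depend on how train.py saves snapshots.
model = torch.load(
    "pickles/nn/epoch_1.pt",
    map_location="cpu",
    weights_only=False,  # required on recent PyTorch to unpickle a full model
)
model.eval()  # disable dropout etc. before inference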

Reference

  • Vaswani et al., "Attention Is All You Need", NeurIPS 2017. https://arxiv.org/abs/1706.03762

License

MIT