PyTorch implementation of the original transformer paper (Vaswani et al.).
english_dataset.txt
and spanish_dataset.txt
are a selection of sentence pairs from the Tatoeba Project. See the repository Tab-delimited Bilingual Sentence Pairs. We have preprocessed the words to make them lowercase and remove special characters.
- The position of Layer Normalization matters. Check paper Xiong et al.
- Learning rate Scheduler: We are using ADAM with a learning rate of 0.0001.
- Train and test model in a real dataset.
- BLEU metric.
- Beam Search.