gordicaleksa/pytorch-original-transformer
My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing concepts that otherwise seem hard. Pretrained IWSLT models are currently included.
Jupyter Notebook · MIT
Issues
An environment problem
#10 opened by BruceWang01 - 0
Sorry, but I couldn't understand where the concatenation layer after the multi-head self-attention is; shouldn't there be one?
#8 opened by Domics10 - 2
Error when running "python training_script.py --batch_size 100 --dataset_name IWSLT --language_direction G2E"
#6 opened by minertom - 0
Sharing the weight matrix between the two embedding layers and the pre-softmax linear transformation
#7 opened by nataly-obr - 2
Issue when running the command: python training_script.py --batchsize 2 -- dataset_name IWSLT --language_direction G2E
#4 opened by adamas-v - 0
Frequency in the positional encodings
#5 opened by FAhtisham - 2