SamLynnEvans/Transformer

A little error in Positional Encoder

MARD1NO opened this issue · 1 comments

pe[pos, i] = math.sin(pos / (10000 ** ((2 * i) / d_model)))
pe[pos, i + 1] = math.cos(pos / (10000 ** ((2 * (i + 1)) / d_model)))

In paper, the cos value should be

pe[pos, i + 1] = math.cos(pos / (10000 ** ((2 * i) / d_model)))

yes, I have the same question.
As the paper (Attention is all you need) says:

pe[pos, 2i] = math.sin(pos / (10000 ** ((2 * i) / d_model)))
pe[pos, 2i + 1] = math.cos(pos / (10000 ** ((2 * i) / d_model)))

where pos is the position and i is the dimension.

Now that the author use for i in range(0, d_model, 2), that means you will jump two steps each iteration.

I think the original code in Embed.py:

for pos in range(max_seq_len):
    for i in range(0, d_model, 2):
        pe[pos, i] = math.sin(pos / (10000 ** ((2 * i) / d_model)))
        pe[pos, i + 1] = math.cos(pos / (10000 ** ((2 * (i + 1)) / d_model)))

replace by:

 for pos in range(max_seq_len):
     for i in range(0, d_model, 2):
         pe[pos, i] = math.sin(pos / (10000 ** (i / d_model)))
         pe[pos, i + 1] = math.cos(pos / (10000 ** (i / d_model)))