jadore801120/attention-is-all-you-need-pytorch

Attention values are strange

YPatrickW opened this issue · 1 comment

When I train the Transformer, I find that the attention values are almost all the same.
Encoder:
[0.10000075, 0.10000038, 0.09999962, 0.10000114, 0.09999923, 0.09999923, 0.1, 0.09999847, 0.10000038, 0.10000075, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Decoder:
[0.19999756 0.2000006  0.20000137 0.2000006  0.19999985 0.         0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0.1666655  0.16666678 0.16666868 0.16666678 0.1666674  0.16666487 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0.14285614 0.14285722 0.14285886 0.14285722 0.14285776 0.14285503 0.14285776 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0.12499904 0.125      0.12500095 0.12499953 0.125      0.12499809 0.125      0.12500238 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
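
For reference, these numbers are what a masked softmax produces when all the scores in a row are nearly identical: roughly 1/10 over the ten unmasked encoder positions, and 1/5, 1/6, 1/7, 1/8 over the positions visible under a causal mask in the decoder. The snippet below is a minimal sketch (not code from this repository) that reproduces the pattern, assuming tiny near-equal attention scores plus a standard padding mask and a causal (subsequent) mask:

```python
import torch
import torch.nn.functional as F

seq_len = 20

# Hypothetical scores: q·k / sqrt(d_k) is close to zero everywhere,
# e.g. early in training, so no key stands out from the others.
scores = torch.randn(seq_len, seq_len) * 1e-5

# Encoder-style padding mask: only the first 10 tokens are real.
pad_mask = torch.arange(seq_len) < 10                      # (seq_len,)
enc_attn = F.softmax(scores.masked_fill(~pad_mask, float('-inf')), dim=-1)
print(enc_attn[0])   # ~0.1 on the 10 real tokens, 0 on the padding

# Decoder-style causal mask: row i may only attend to positions 0..i.
causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
dec_attn = F.softmax(scores.masked_fill(~causal_mask, float('-inf')), dim=-1)
print(dec_attn[4])   # ~0.2   over the 5 visible positions
print(dec_attn[7])   # ~0.125 over the 8 visible positions
```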


Hi, I ran into the same problem. Do you know the cause of this?