Why are the queries normalized but not the keys when using multihead_attention?
feynmanma7 opened this issue · 0 comments
feynmanma7 commented
# Self-attention
self.seq = multihead_attention(queries=normalize(self.seq),
                               keys=self.seq,
                               num_units=args.hidden_units,
                               num_heads=args.num_heads,
                               dropout_rate=args.dropout_rate,
                               is_training=self.is_training,
                               causality=True,
                               scope="self_attention")
https://github.com/kang205/SASRec/blob/e3738967fddab206d6eeb4fda433e7a7034dd8b1/model.py#L54
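To make the asymmetry concrete, here is a minimal NumPy sketch of single-head causal attention in which only the queries pass through layer normalization. It is not the repository's `multihead_attention`: the helper names `layer_norm` and `causal_attention` are hypothetical, the learned Q/K/V projections, dropout, multiple heads and any residual connection are omitted, and the values are assumed to be taken from the same tensor as the keys. It only shows what the call computes in each case, not why the authors chose it.

```python
import numpy as np

def layer_norm(x, eps=1e-8):
    # Normalize each position over its feature dimension (no learned scale/shift).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def causal_attention(queries, keys):
    # Scaled dot-product attention; values are assumed to equal the keys.
    t, d = queries.shape
    scores = queries @ keys.T / np.sqrt(d)
    # Causal mask: position i may only attend to positions j <= i.
    future = np.triu(np.ones((t, t), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ keys, weights

rng = np.random.default_rng(0)
seq = rng.normal(loc=3.0, scale=2.0, size=(4, 8))  # toy (time, hidden) input

# As in the snippet above: normalized queries, raw keys (and values).
out_asym, w_asym = causal_attention(layer_norm(seq), seq)
# Symmetric alternative for comparison: both sides normalized.
out_sym, w_sym = causal_attention(layer_norm(seq), layer_norm(seq))
```

In this sketch the asymmetric variant mixes the raw, un-normalized sequence vectors, while the symmetric variant mixes normalized ones, so the two outputs generally differ in scale and offset.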
Thank you!