Multihead Attention Seed Specification
egehancosgun opened this issue · 1 comment
egehancosgun commented
The Dropout layer inside the multi-head attention layer does not accept a seed argument, so when dropout is active the outputs are non-deterministic. Could you please add a seed argument in a future release?
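
To illustrate the request, here is a minimal pure-Python sketch of why a seed matters for dropout. It is not the Keras implementation: the `dropout` function, its `seed` parameter, and the list-based input are all hypothetical and exist only for this example. It also builds a fresh generator on every call, which is a simplification compared with a stateful layer.

```python
import random


def dropout(values, rate, seed=None):
    """Inverted dropout over a list of floats (illustrative only, not Keras code)."""
    # With seed=None the generator is seeded from system entropy or the clock,
    # so the mask changes between calls. A fixed seed fixes the mask.
    rng = random.Random(seed)
    keep = 1.0 - rate
    # Keep each value with probability `keep` and rescale it; zero it otherwise.
    return [v / keep if rng.random() < keep else 0.0 for v in values]


x = [1.0] * 8
seeded_a = dropout(x, rate=0.5, seed=42)
seeded_b = dropout(x, rate=0.5, seed=42)
unseeded = dropout(x, rate=0.5)  # mask is not reproducible across runs

print(seeded_a == seeded_b)  # True: same seed, same mask
```

Without a way to pass such a seed through the attention layer to its internal dropout, there is no way to get the reproducible behavior shown by the two seeded calls.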
fchollet commented
Thanks for the suggestion. This is now added.