Transformer version Self-Attention (q, k, v) in Keras
Primary Language: Python
Transformer-style self-attention (q, k, v) for Keras 2.0, with masking support.
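For readers unfamiliar with the idea, the following is a minimal sketch of masked scaled dot-product self-attention. It is not this repository's code: it uses plain Python lists instead of Keras tensors so that it runs without dependencies, and the names `matmul`, `masked_softmax`, `self_attention`, `w_q`, `w_k`, `w_v`, and `mask` are hypothetical. It assumes a single sequence, a single head, and a padding mask in which 1 marks a real token and 0 marks padding.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def masked_softmax(scores, mask):
    """Row-wise softmax that gives masked key positions a weight of exactly 0.

    Assumes at least one position in `mask` is 1 (a real token).
    """
    weights = []
    for row in scores:
        # Subtract the largest unmasked score for numerical stability.
        peak = max(s for s, m in zip(row, mask) if m)
        exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(row, mask)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def self_attention(x, w_q, w_k, w_v, mask):
    """Compute softmax(q k^T / sqrt(d_k)) v, with q, k, v all projected from x."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = masked_softmax(scores, mask)
    return matmul(weights, v), weights


# Three tokens with two features each; the last token is padding.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # identity projections keep the example simple
mask = [1, 1, 0]

out, weights = self_attention(x, identity, identity, identity, mask)
```

With this mask, every row of `weights` assigns zero weight to the padded third token and the remaining weights in each row sum to 1. A Keras layer would typically get the same effect on batched tensors by adding a large negative value to the masked scores before the softmax.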