A from-scratch implementation of the Transformer attention mechanism in TensorFlow/Keras
Primary language: Jupyter Notebook