The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Levy. EMNLP, 2021.
Primary LanguagePythonMIT LicenseMIT
No one’s star this repository yet.