Model Editing Papers
Must-read papers on model editing with large language models.
🔔 News
- 2023-07 We release EasyEdit, an easy-to-use framework to edit Large Language Models.
- 2023-06 We will provide a tutorial on Editing Large Language Models at AACL 2023.
- 2023-05 We release a new analysis paper: "Editing Large Language Models: Problems, Methods, and Opportunities", based on this repository! We look forward to any comments or discussions on this topic :)
- 2022-12 We create this repository to maintain a paper list on Model Editing.
🔍 Contents
🌟 Why Model Editing?
Model Editing is a compelling field of research that focuses on facilitating efficient modifications to the behavior of models, particularly foundation models. The aim is to implement these changes within a specified scope of interest without negatively affecting the model's performance across a broader range of inputs.
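As a rough illustration of the goal described above, the following minimal Python sketch treats a model as a plain function from prompts to answers and checks two properties of an edit: whether the in-scope prompt now yields the target answer, and whether out-of-scope prompts are left unchanged. The toy lookup-table model, the example prompts, and the names `make_toy_model`, `apply_edit`, `edit_success`, and `locality` are all hypothetical; they are not taken from any paper or tool listed here.

```python
from typing import Callable, Dict, List

# A "model" here is just a function from a prompt to an answer (an assumption
# made for illustration; real language models are far more complex).
Model = Callable[[str], str]


def make_toy_model(facts: Dict[str, str]) -> Model:
    """Return a toy model that answers prompts from a lookup table."""
    def model(prompt: str) -> str:
        return facts.get(prompt, "unknown")
    return model


def apply_edit(model: Model, prompt: str, target: str) -> Model:
    """Wrap the model so that one in-scope prompt maps to the new target.

    All other prompts fall through to the original model unchanged.
    """
    def edited(p: str) -> str:
        return target if p == prompt else model(p)
    return edited


def edit_success(edited: Model, prompt: str, target: str) -> bool:
    """Check that the edited model gives the target answer on the edited prompt."""
    return edited(prompt) == target


def locality(original: Model, edited: Model, probes: List[str]) -> float:
    """Fraction of out-of-scope probes whose output is unchanged by the edit."""
    if not probes:
        return 1.0
    unchanged = sum(original(p) == edited(p) for p in probes)
    return unchanged / len(probes)


# Hypothetical facts and edit, chosen only to exercise the functions.
base = make_toy_model({
    "latest version of library X": "1.0",
    "maintainer of library X": "Alice",
    "license of library X": "MIT",
})
edited = apply_edit(base, "latest version of library X", "2.0")
out_of_scope = ["maintainer of library X", "license of library X"]

print(edit_success(edited, "latest version of library X", "2.0"))
print(locality(base, edited, out_of_scope))
```

In this toy setting the edit is trivially local because it only intercepts one exact prompt; the methods in the papers below address the much harder case where the edit must generalize to related inputs while leaving unrelated ones alone.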
Keywords
Model Editing has strong connections with the following topics:
- Updating and fixing bugs for large language models
- Language models as knowledge base, locating knowledge in large language models
- Lifelong learning, unlearning, etc.
- Security and privacy for large language models
📜 Papers
This is a collection of research and review papers on Model Editing. Suggestions and pull requests are welcome to help share the latest research progress.
Overview
Editing Large Language Models: Problems, Methods, and Opportunities. [paper]
Methods
Preserve Parameters
Memory-based
- Memory-Based Model Editing at Scale. (ICML 2022)
Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn. [paper] [code] [demo]
- Fixing Model Bugs with Natural Language Patches. (EMNLP 2022)
Shikhar Murty, Christopher D. Manning, Scott M. Lundberg, Marco Túlio Ribeiro. [paper] [code]
- MemPrompt: Memory-assisted Prompt Editing with User Feedback. (EMNLP 2022)
Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang. [paper] [code] [page] [video]
- Large Language Models with Controllable Working Memory.
Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix Yu, Sanjiv Kumar. [paper]
- Can We Edit Factual Knowledge by In-Context Learning?
Ce Zheng, Lei Li, Qingxiu Dong, Yuxuan Fan, Zhiyong Wu, Jingjing Xu, Baobao Chang. [paper]
- Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge.
Yasumasa Onoe, Michael J.Q. Zhang, Shankar Padmanabhan, Greg Durrett, Eunsol Choi. [paper]
- MQUAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions.
Zexuan Zhong, Zhengxuan Wu, Christopher D. Manning, Christopher Potts, Danqi Chen. [paper]
Additional Parameters
- Calibrating Factual Knowledge in Pretrained Language Models. (EMNLP 2022)
Qingxiu Dong, Damai Dai, Yifan Song, Jingjing Xu, Zhifang Sui, Lei Li. [paper] [code]
- Transformer-Patcher: One Mistake worth One Neuron. (ICLR 2023)
Zeyu Huang, Yikang Shen, Xiaofeng Zhang, Jie Zhou, Wenge Rong, Zhang Xiong. [paper] [code]
- Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors.
Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi. [paper] [code]
- Neural Knowledge Bank for Pretrained Transformers.
Damai Dai, Wenbin Jiang, Qingxiu Dong, Yajuan Lyu, Qiaoqiao She, Zhifang Sui. [paper]
Change LM's representation space
- Inspecting and Editing Knowledge Representations in Language Models
Evan Hernandez, Belinda Z. Li, Jacob Andreas. [paper] [code]
Modify Parameters
Finetuning
- Plug-and-Play Adaptation for Continuously-updated QA. (ACL 2022 Findings)
Kyungjae Lee, Wookje Han, Seung-won Hwang, Hwaran Lee, Joonsuk Park, Sang-Woo Lee. [paper] [code]
- Modifying Memories in Transformer Models.
Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar. [paper]
Meta-learning
- Editing Factual Knowledge in Language Models. (EMNLP 2021)
Nicola De Cao, Wilker Aziz, Ivan Titov. [paper] [code]
- Fast Model Editing at Scale. (ICLR 2022)
Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning. [paper] [code] [page]
- Editable Neural Networks. (ICLR 2020)
Anton Sinitsin, Vsevolod Plokhotnyuk, Dmitry V. Pyrkin, Sergei Popov, Artem Babenko. [paper] [code]
Locate and edit
- Editing a classifier by rewriting its prediction rules. (NeurIPS 2021)
Shibani Santurkar, Dimitris Tsipras, Mahalaxmi Elango, David Bau, Antonio Torralba, Aleksander Madry. [paper] [code]
- Language Anisotropic Cross-Lingual Model Editing.
Yang Xu, Yutai Hou, Wanxiang Che. [paper]
- Repairing Neural Networks by Leaving the Right Past Behind.
Ryutaro Tanno, Melanie F. Pradier, Aditya Nori, Yingzhen Li. [paper]
- Locating and Editing Factual Associations in GPT. (NeurIPS 2022)
Kevin Meng, David Bau, Alex Andonian, Yonatan Belinkov. [paper] [code] [page] [video]
- Mass-Editing Memory in a Transformer.
Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau. [paper] [code] [page] [demo]
- Editing models with task arithmetic.
Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Ludwig Schmidt, Hannaneh Hajishirzi, Ali Farhadi. [paper]
- Editing Commonsense Knowledge in GPT.
Anshita Gupta, Debanjan Mondal, Akshay Krishna Sheshadri, Wenlong Zhao, Xiang Lorraine Li, Sarah Wiegreffe, Niket Tandon. [paper]
- Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs.
Peter Hase, Mona Diab, Asli Celikyilmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srinivasan Iyer. [paper] [code]
- Detecting Edit Failures In Large Language Models: An Improved Specificity Benchmark.
Jason Hoelscher-Obermaier, Julia Persson, Esben Kran, Ioannis Konstas, Fazl Barez. [paper]
- Knowledge Neurons in Pretrained Transformers. (ACL 2022)
Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, Furu Wei. [paper] [code] [code by EleutherAI]
- LEACE: Perfect linear concept erasure in closed form.
Nora Belrose, David Schneider-Joseph, Shauli Ravfogel, Ryan Cotterell, Edward Raff, Stella Biderman. [paper]
- Transformer Feed-Forward Layers Are Key-Value Memories. (EMNLP 2021)
Mor Geva, Roei Schuster, Jonathan Berant, Omer Levy. [paper]
- Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space. (EMNLP 2022)
Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg. [paper]
More Related Papers
- FRUIT: Faithfully Reflecting Updated Information in Text. (NAACL 2022)
Robert L. Logan IV, Alexandre Passos, Sameer Singh, Ming-Wei Chang. [paper] [code]
- Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning. (EMNLP 2022)
Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark. [paper] [code] [video]
- Towards Tracing Factual Knowledge in Language Models Back to the Training Data. (EMNLP 2022)
Ekin Akyürek, Tolga Bolukbasi, Frederick Liu, Binbin Xiong, Ian Tenney, Jacob Andreas, Kelvin Guu. [paper]
- Prompting GPT-3 To Be Reliable.
Chenglei Si, Zhe Gan, Zhengyuan Yang, Shuohang Wang, Jianfeng Wang, Jordan Boyd-Graber, Lijuan Wang. [paper]
- Patching open-vocabulary models by interpolating weights. (NeurIPS 2022)
Gabriel Ilharco, Mitchell Wortsman, Samir Yitzhak Gadre, Shuran Song, Hannaneh Hajishirzi, Simon Kornblith, Ali Farhadi, Ludwig Schmidt. [paper] [code]
- Decouple knowledge from parameters for plug-and-play language modeling. (ACL 2023 Findings)
Xin Cheng, Yankai Lin, Xiuying Chen, Dongyan Zhao, Rui Yan. [paper] [code]
- Backpack Language Models.
John Hewitt, John Thickstun, Christopher D. Manning, Percy Liang. [paper]
Analysis
- Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models.
Peter Hase, Mohit Bansal, Been Kim, Asma Ghandeharioun. [paper] [code]
- Dissecting Recall of Factual Associations in Auto-Regressive Language Models.
Mor Geva, Jasmijn Bastings, Katja Filippova, Amir Globerson. [paper]
- Evaluating the Ripple Effects of Knowledge Editing in Language Models.
Roi Cohen, Eden Biran, Ori Yoran, Amir Globerson, Mor Geva. [paper]
🧰 Resources
Benchmarks and Tasks
Edit Type | Benchmarks & Datasets |
---|---|
Fact Knowledge | ZSRE, CounterFact, CounterFact+, ECBD, MQUAKE |
Sentiment | Convsent |
Bias | Bias in Bios |
Toxic Information | RealToxicityPrompts |
Tools
EasyEdit: An Easy-to-use Framework to Edit Large Language Models.
FastEdit: Editing large language models within 10 seconds
Contribution
Contributors
Contributing to this paper list
- We may have missed important works in this field; please contribute to this repo! Thanks in advance for your efforts.
Acknowledgement
- We would like to express our gratitude to Longhui Yu for the kind reminder about the missing papers.