Must-read Papers on Model Editing.

MIT License

Model Editing Papers

Must-read papers on model editing with large language models.

🔔 News

  • 2023-07 We release EasyEdit, an easy-to-use framework to edit Large Language Models.
  • 2023-06 We will provide a tutorial on Editing Large Language Models at AACL 2023.
  • 2023-05 We release a new analysis paper: "Editing Large Language Models: Problems, Methods, and Opportunities", based on this repository! We look forward to any comments or discussions on this topic :)
  • 2022-12 We create this repository to maintain a paper list on Model Editing.

🌟 Why Model Editing?

Model Editing is a compelling field of research that focuses on facilitating efficient modifications to the behavior of models, particularly foundation models. The aim is to implement these changes within a specified scope of interest without negatively affecting the model's performance across a broader range of inputs.
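
The goal described above can be made concrete with a toy example. The sketch below is not the method of any paper in this list: `base_model`, `EditedModel`, and the example prompts are hypothetical names introduced only for illustration. It wraps a frozen lookup-table "model" with an edit memory, then checks two things: that the in-scope input returns the new answer, and that out-of-scope inputs are unchanged.

```python
from typing import Callable, Dict, Iterable


def base_model(prompt: str) -> str:
    """Stand-in for a frozen pretrained model (a hypothetical toy lookup)."""
    knowledge = {
        "capital of France?": "Paris",
        "capital of Japan?": "Tokyo",
        "author of Hamlet?": "Marlowe",  # deliberately wrong fact, to be edited
    }
    return knowledge.get(prompt, "unknown")


class EditedModel:
    """Leaves the base model untouched and stores edits in a side memory."""

    def __init__(self, model: Callable[[str], str]) -> None:
        self.model = model
        self.edits: Dict[str, str] = {}

    def edit(self, prompt: str, new_answer: str) -> None:
        self.edits[prompt] = new_answer

    def __call__(self, prompt: str) -> str:
        # In-scope prompts are answered from memory; all others fall through.
        return self.edits.get(prompt, self.model(prompt))


def edit_success(model: Callable[[str], str], targets: Dict[str, str]) -> float:
    """Fraction of edited prompts that now return the requested answer."""
    return sum(model(p) == a for p, a in targets.items()) / len(targets)


def locality(before: Callable[[str], str], after: Callable[[str], str],
             prompts: Iterable[str]) -> float:
    """Fraction of out-of-scope prompts whose output did not change."""
    prompts = list(prompts)
    return sum(before(p) == after(p) for p in prompts) / len(prompts)


edited = EditedModel(base_model)
edited.edit("author of Hamlet?", "Shakespeare")

success = edit_success(edited, {"author of Hamlet?": "Shakespeare"})
unchanged = locality(base_model, edited,
                     ["capital of France?", "capital of Japan?"])
print(success, unchanged)
```

An exact-match memory like this one would fail on a paraphrase such as "Who wrote Hamlet?", so the scope of this toy edit is a single prompt string. Handling paraphrases and related inputs is part of what the methods listed below are designed to address.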

Keywords

Model Editing has strong connections with the following topics.

  • Updating and fixing bugs for large language models
  • Language models as knowledge bases, locating knowledge in large language models
  • Lifelong learning, unlearning, etc.
  • Security and privacy for large language models

📜 Papers

This is a collection of research and review papers on Model Editing. Suggestions and pull requests are welcome, to help share the latest research progress.

Overview

Editing Large Language Models: Problems, Methods, and Opportunities. [paper]

Methods

Preserve Parameters

Memory-based
  1. Memory-Based Model Editing at Scale (ICML 2022)
    Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn. [paper] [code] [demo]

  2. Fixing Model Bugs with Natural Language Patches. (EMNLP 2022)
    Shikhar Murty, Christopher D. Manning, Scott M. Lundberg, Marco Túlio Ribeiro. [paper] [code]

  3. MemPrompt: Memory-assisted Prompt Editing with User Feedback. (EMNLP 2022)
    Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang. [paper] [code] [page] [video]

  4. Large Language Models with Controllable Working Memory.
    Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix Yu, Sanjiv Kumar. [paper]

  5. Can We Edit Factual Knowledge by In-Context Learning?
    Ce Zheng, Lei Li, Qingxiu Dong, Yuxuan Fan, Zhiyong Wu, Jingjing Xu, Baobao Chang. [paper]

  6. Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge
    Yasumasa Onoe, Michael J.Q. Zhang, Shankar Padmanabhan, Greg Durrett, Eunsol Choi. [paper]

  7. MQUAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
    Zexuan Zhong, Zhengxuan Wu, Christopher D. Manning, Christopher Potts, Danqi Chen. [paper]

Additional Parameters
  1. Calibrating Factual Knowledge in Pretrained Language Models. (EMNLP 2022)
    Qingxiu Dong, Damai Dai, Yifan Song, Jingjing Xu, Zhifang Sui, Lei Li. [paper] [code]

  2. Transformer-Patcher: One Mistake worth One Neuron. (ICLR 2023)
    Zeyu Huang, Yikang Shen, Xiaofeng Zhang, Jie Zhou, Wenge Rong, Zhang Xiong. [paper] [code]

  3. Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors.
    Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi. [paper] [code]

  4. Neural Knowledge Bank for Pretrained Transformers
    Damai Dai, Wenbin Jiang, Qingxiu Dong, Yajuan Lyu, Qiaoqiao She, Zhifang Sui. [paper]

Change LM's representation space
  1. Inspecting and Editing Knowledge Representations in Language Models
    Evan Hernandez, Belinda Z. Li, Jacob Andreas. [paper] [code]

Modify Parameters

Finetuning
  1. Plug-and-Play Adaptation for Continuously-updated QA. (ACL 2022 Findings)
    Kyungjae Lee, Wookje Han, Seung-won Hwang, Hwaran Lee, Joonsuk Park, Sang-Woo Lee. [paper] [code]

  2. Modifying Memories in Transformer Models.
    Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar. [paper]

Meta-learning
  1. Editing Factual Knowledge in Language Models. (EMNLP 2021)
    Nicola De Cao, Wilker Aziz, Ivan Titov. [paper] [code]

  2. Fast Model Editing at Scale. (ICLR 2022)
    Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning. [paper] [code] [page]

  3. Editable Neural Networks. (ICLR 2020)
    Anton Sinitsin, Vsevolod Plokhotnyuk, Dmitry V. Pyrkin, Sergei Popov, Artem Babenko. [paper] [code]

Locate and edit
  1. Editing a classifier by rewriting its prediction rules. (NeurIPS 2021)
    Shibani Santurkar, Dimitris Tsipras, Mahalaxmi Elango, David Bau, Antonio Torralba, Aleksander Madry. [paper] [code]

  2. Language Anisotropic Cross-Lingual Model Editing.
    Yang Xu, Yutai Hou, Wanxiang Che. [paper]

  3. Repairing Neural Networks by Leaving the Right Past Behind.
    Ryutaro Tanno, Melanie F. Pradier, Aditya Nori, Yingzhen Li. [paper]

  4. Locating and Editing Factual Associations in GPT. (NeurIPS 2022)
    Kevin Meng, David Bau, Alex Andonian, Yonatan Belinkov. [paper] [code] [page] [video]

  5. Mass-Editing Memory in a Transformer.
    Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau. [paper] [code] [page] [demo]

  6. Editing models with task arithmetic.
    Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Ludwig Schmidt, Hannaneh Hajishirzi, Ali Farhadi. [paper]

  7. Editing Commonsense Knowledge in GPT.
    Anshita Gupta, Debanjan Mondal, Akshay Krishna Sheshadri, Wenlong Zhao, Xiang Lorraine Li, Sarah Wiegreffe, Niket Tandon. [paper]

  8. Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs.
    Peter Hase, Mona Diab, Asli Celikyilmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srinivasan Iyer. [paper] [code]

  9. Detecting Edit Failures In Large Language Models: An Improved Specificity Benchmark.
    Jason Hoelscher-Obermaier, Julia Persson, Esben Kran, Ioannis Konstas, Fazl Barez. [paper]

  10. Knowledge Neurons in Pretrained Transformers. (ACL 2022)
    Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, Furu Wei. [paper] [code] [code by EleutherAI]

  11. LEACE: Perfect linear concept erasure in closed form.
    Nora Belrose, David Schneider-Joseph, Shauli Ravfogel, Ryan Cotterell, Edward Raff, Stella Biderman. [paper]

  12. Transformer Feed-Forward Layers Are Key-Value Memories. (EMNLP 2021)
    Mor Geva, Roei Schuster, Jonathan Berant, Omer Levy. [paper]

  13. Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space. (EMNLP 2022)
    Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg. [paper]

More Related Papers

  1. FRUIT: Faithfully Reflecting Updated Information in Text. (NAACL 2022)
    Robert L. Logan IV, Alexandre Passos, Sameer Singh, Ming-Wei Chang. [paper] [code]

  2. Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning. (EMNLP 2022)
    Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark. [paper] [code] [video]

  3. Towards Tracing Factual Knowledge in Language Models Back to the Training Data. (EMNLP 2022)
    Ekin Akyürek, Tolga Bolukbasi, Frederick Liu, Binbin Xiong, Ian Tenney, Jacob Andreas, Kelvin Guu. [paper]

  4. Prompting GPT-3 To Be Reliable.
    Chenglei Si, Zhe Gan, Zhengyuan Yang, Shuohang Wang, Jianfeng Wang, Jordan Boyd-Graber, Lijuan Wang. [paper]

  5. Patching open-vocabulary models by interpolating weights. (NeurIPS 2022)
    Gabriel Ilharco, Mitchell Wortsman, Samir Yitzhak Gadre, Shuran Song, Hannaneh Hajishirzi, Simon Kornblith, Ali Farhadi, Ludwig Schmidt. [paper] [code]

  6. Decouple knowledge from parameters for plug-and-play language modeling. (ACL 2023 Findings)
    Xin Cheng, Yankai Lin, Xiuying Chen, Dongyan Zhao, Rui Yan. [paper] [code]

  7. Backpack Language Models
    John Hewitt, John Thickstun, Christopher D. Manning, Percy Liang. [paper]

Analysis

  1. Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models.
    Peter Hase, Mohit Bansal, Been Kim, Asma Ghandeharioun. [paper] [code]
  2. Dissecting Recall of Factual Associations in Auto-Regressive Language Models
    Mor Geva, Jasmijn Bastings, Katja Filippova, Amir Globerson. [paper]
  3. Evaluating the Ripple Effects of Knowledge Editing in Language Models
    Roi Cohen, Eden Biran, Ori Yoran, Amir Globerson, Mor Geva. [paper]

🧰 Resources

Benchmarks and Tasks

Edit Type            Benchmarks & Datasets
Fact Knowledge       ZSRE, CounterFact, CounterFact+, ECBD, MQUAKE
Sentiment            Convsent
Bias                 Bias in Bios
Toxic Information    RealToxicityPrompts

Tools

EasyEdit: An Easy-to-use Framework to Edit Large Language Models.

FastEdit: Editing large language models within 10 seconds

Contribution

Contributing to this paper list

  • We may have missed important works in this field. If so, please contribute to this repo! Thanks in advance for your efforts.

Acknowledgement

  • We would like to express our gratitude to Longhui Yu for the kind reminder about the missing papers.