Knowledge Editing for LLMs Papers

Must-read papers on knowledge editing for large language models.

🌟 Why Knowledge Editing?

Knowledge Editing is a compelling field of research that focuses on making efficient, targeted modifications to the behavior of models, particularly foundation models. The aim is to apply a change within a specified scope of interest without degrading the model's performance on the broader range of inputs outside that scope.
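
For a rough sense of the objective, here is a hypothetical sketch in plain Python (not taken from any paper below): an edited model should return the new target for queries inside the edit scope (reliability) while agreeing with the original model everywhere else (locality). All names in the sketch are made up for exposition.

```python
# Hypothetical illustration of the knowledge-editing objective.
# All names here are invented for exposition only.

def make_edited_model(base_model, in_edit_scope, new_answer):
    """Return a model that answers `new_answer` inside the edit scope
    and defers to `base_model` on all other inputs (locality)."""
    def edited_model(query):
        if in_edit_scope(query):      # in-scope: apply the edit
            return new_answer
        return base_model(query)      # out-of-scope: preserve old behavior
    return edited_model

# Toy usage: update one fact without disturbing unrelated ones.
base = lambda q: {"Who is the UK Prime Minister?": "Boris Johnson",
                  "What is the capital of France?": "Paris"}.get(q, "unknown")
edited = make_edited_model(base,
                           in_edit_scope=lambda q: "UK Prime Minister" in q,
                           new_answer="Rishi Sunak")

assert edited("Who is the UK Prime Minister?") == "Rishi Sunak"   # reliability
assert edited("What is the capital of France?") == "Paris"        # locality
```

The methods collected below pursue the same contract, but realize it by modifying or augmenting the model itself rather than wrapping it.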

Keywords

Knowledge Editing has strong connections with the following topics:

  • Updating and fixing bugs for large language models
  • Language models as knowledge bases; locating knowledge in large language models
  • Lifelong learning, unlearning, etc.
  • Security and privacy for large language models

📜 Papers

This is a collection of research and review papers on Knowledge Editing. Suggestions and pull requests are welcome to help share the latest research progress.

Overview

Editing Large Language Models: Problems, Methods, and Opportunities, EMNLP 2023 Main Conference Paper. [paper]

Editing Large Language Models, AACL 2023 Tutorial. [Github] [Google Drive] [Baidu Pan]

Knowledge Editing for Large Language Models: A Survey
Song Wang, Yaochen Zhu, Haochen Liu, Zaiyi Zheng, Chen Chen, Jundong Li. [paper]

A Survey on Knowledge Editing of Neural Networks
Vittorio Mazzia, Alessandro Pedrani, Andrea Caciolai, Kay Rottmann, Davide Bernardi. [paper]

Knowledge Unlearning for LLMs: Tasks, Methods, and Challenges
Nianwen Si, Hao Zhang, Heyu Chang, Wenlin Zhang, Dan Qu, Weiqiang Zhang. [paper]

Methods

Preserve Parameters

Memory-based
  1. Memory-Based Model Editing at Scale (ICML 2022)
    Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn. [paper] [code] [demo]

  2. Fixing Model Bugs with Natural Language Patches. (EMNLP 2022)
    Shikhar Murty, Christopher D. Manning, Scott M. Lundberg, Marco Túlio Ribeiro. [paper] [code]

  3. MemPrompt: Memory-assisted Prompt Editing with User Feedback. (EMNLP 2022)
    Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang. [paper] [code] [page] [video]

  4. Large Language Models with Controllable Working Memory.
    Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix Yu, Sanjiv Kumar. [paper]

  5. Can We Edit Factual Knowledge by In-Context Learning?
    Ce Zheng, Lei Li, Qingxiu Dong, Yuxuan Fan, Zhiyong Wu, Jingjing Xu, Baobao Chang. [paper]

  6. Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge
    Yasumasa Onoe, Michael J.Q. Zhang, Shankar Padmanabhan, Greg Durrett, Eunsol Choi. [paper]

  7. MQUAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
    Zexuan Zhong, Zhengxuan Wu, Christopher D. Manning, Christopher Potts, Danqi Chen. [paper]

Additional Parameters
  1. Calibrating Factual Knowledge in Pretrained Language Models. (EMNLP 2022)
    Qingxiu Dong, Damai Dai, Yifan Song, Jingjing Xu, Zhifang Sui, Lei Li. [paper] [code]

  2. Transformer-Patcher: One Mistake worth One Neuron. (ICLR 2023)
    Zeyu Huang, Yikang Shen, Xiaofeng Zhang, Jie Zhou, Wenge Rong, Zhang Xiong. [paper] [code]

  3. Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors.
    Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi. [paper] [code]

  4. Neural Knowledge Bank for Pretrained Transformers
    Damai Dai, Wenbin Jiang, Qingxiu Dong, Yajuan Lyu, Qiaoqiao She, Zhifang Sui. [paper]

  5. Rank-One Editing of Encoder-Decoder Models
    Vikas Raunak, Arul Menezes. [paper]

Change LM's representation space
  1. Inspecting and Editing Knowledge Representations in Language Models
    Evan Hernandez, Belinda Z. Li, Jacob Andreas. [paper] [code]

Modify Parameters

Finetuning
  1. Plug-and-Play Adaptation for Continuously-updated QA. (ACL 2022 Findings)
    Kyungjae Lee, Wookje Han, Seung-won Hwang, Hwaran Lee, Joonsuk Park, Sang-Woo Lee. [paper] [code]

  2. Modifying Memories in Transformer Models.
    Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar. [paper]

  3. Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language Models
    Shiwen Ni, Dingwei Chen, Chengming Li, Xiping Hu, Ruifeng Xu and Min Yang. [paper]

Meta-learning
  1. Editing Factual Knowledge in Language Models.
    Nicola De Cao, Wilker Aziz, Ivan Titov. (EMNLP 2021) [paper] [code]

  2. Fast Model Editing at Scale. (ICLR 2022)
    Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning. [paper] [code] [page]

  3. Editable Neural Networks. (ICLR 2020)
    Anton Sinitsin, Vsevolod Plokhotnyuk, Dmitry V. Pyrkin, Sergei Popov, Artem Babenko. [paper] [code]

Locate and edit
  1. Editing a classifier by rewriting its prediction rules. (NeurIPS 2021)
    Shibani Santurkar, Dimitris Tsipras, Mahalaxmi Elango, David Bau, Antonio Torralba, Aleksander Madry. [paper] [code]

  2. Language Anisotropic Cross-Lingual Model Editing.
    Yang Xu, Yutai Hou, Wanxiang Che. [paper]

  3. Repairing Neural Networks by Leaving the Right Past Behind.
    Ryutaro Tanno, Melanie F. Pradier, Aditya Nori, Yingzhen Li. [paper]

  4. Locating and Editing Factual Associations in GPT. (NeurIPS 2022)
    Kevin Meng, David Bau, Alex Andonian, Yonatan Belinkov. [paper] [code] [page] [video]

  5. Mass-Editing Memory in a Transformer.
    Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau. [paper] [code] [page] [demo]

  6. Editing models with task arithmetic.
    Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Ludwig Schmidt, Hannaneh Hajishirzi, Ali Farhadi. [paper]

  7. Editing Commonsense Knowledge in GPT.
    Anshita Gupta, Debanjan Mondal, Akshay Krishna Sheshadri, Wenlong Zhao, Xiang Lorraine Li, Sarah Wiegreffe, Niket Tandon. [paper]

  8. Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs.
    Peter Hase, Mona Diab, Asli Celikyilmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srinivasan Iyer. [paper] [code]

  9. Detecting Edit Failures In Large Language Models: An Improved Specificity Benchmark.
    Jason Hoelscher-Obermaier, Julia Persson, Esben Kran, Ioannis Konstas, Fazl Barez. [paper]

  10. Knowledge Neurons in Pretrained Transformers. (ACL 2022)
    Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, Furu Wei. [paper] [code] [code by EleutherAI]

  11. LEACE: Perfect linear concept erasure in closed form.
    Nora Belrose, David Schneider-Joseph, Shauli Ravfogel, Ryan Cotterell, Edward Raff, Stella Biderman. [paper]

  12. Transformer Feed-Forward Layers Are Key-Value Memories. (EMNLP 2021)
    Mor Geva, Roei Schuster, Jonathan Berant, Omer Levy. [paper]

  13. Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space. (EMNLP 2022)
    Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg. [paper]

  14. PMET: Precise Model Editing in a Transformer.
    Xiaopeng Li, Shasha Li, Shezheng Song, Jing Yang, Jun Ma, Jie Yu. [paper] [code]

  15. Unlearning Bias in Language Models by Partitioning Gradients. (ACL 2023 Findings)
    Charles Yu, Sullam Jeoung, Anish Kasi, Pengfei Yu, Heng Ji. [paper] [code]

  16. DEPN: Detecting and Editing Privacy Neurons in Pretrained Language Models (EMNLP 2023)
    Xinwei Wu, Junzhuo Li, Minghui Xu, Weilong Dong, Shuangzhi Wu, Chao Bian, Deyi Xiong. [paper]

  17. Untying the Reversal Curse via Bidirectional Language Model Editing
    Jun-Yu Ma, Jia-Chen Gu, Zhen-Hua Ling, Quan Liu, Cong Liu. [paper]

More Related Papers

  1. FRUIT: Faithfully Reflecting Updated Information in Text. (NAACL 2022)
    Robert L. Logan IV, Alexandre Passos, Sameer Singh, Ming-Wei Chang. [paper] [code]

  2. Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning. (EMNLP 2022)
    Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark. [paper] [code] [video]

  3. Towards Tracing Factual Knowledge in Language Models Back to the Training Data.
    Ekin Akyürek, Tolga Bolukbasi, Frederick Liu, Binbin Xiong, Ian Tenney, Jacob Andreas, Kelvin Guu. (EMNLP 2022) [paper]

  4. Prompting GPT-3 To Be Reliable.
    Chenglei Si, Zhe Gan, Zhengyuan Yang, Shuohang Wang, Jianfeng Wang, Jordan Boyd-Graber, Lijuan Wang. [paper]

  5. Patching open-vocabulary models by interpolating weights. (NeurIPS 2022)
    Gabriel Ilharco, Mitchell Wortsman, Samir Yitzhak Gadre, Shuran Song, Hannaneh Hajishirzi, Simon Kornblith, Ali Farhadi, Ludwig Schmidt. [paper] [code]

  6. Decouple knowledge from parameters for plug-and-play language modeling (ACL 2023 Findings)
    Xin Cheng, Yankai Lin, Xiuying Chen, Dongyan Zhao, Rui Yan. [paper] [code]

  7. Backpack Language Models
    John Hewitt, John Thickstun, Christopher D. Manning, Percy Liang. [paper]

  8. Learning to Model Editing Processes. (EMNLP 2022)
    Machel Reid, Graham Neubig. [paper]

  9. Trends in Integration of Knowledge and Large Language Models: A Survey and Taxonomy of Methods, Benchmarks, and Applications.
    Zhangyin Feng, Weitao Ma, Weijiang Yu, Lei Huang, Haotian Wang, Qianglong Chen, Weihua Peng, Xiaocheng Feng, Bing Qin, Ting Liu. [paper]

Analysis

  1. Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models.
    Peter Hase, Mohit Bansal, Been Kim, Asma Ghandeharioun. [paper] [code]
  2. Dissecting Recall of Factual Associations in Auto-Regressive Language Models
    Mor Geva, Jasmijn Bastings, Katja Filippova, Amir Globerson. [paper]
  3. Evaluating the Ripple Effects of Knowledge Editing in Language Models
    Roi Cohen, Eden Biran, Ori Yoran, Amir Globerson, Mor Geva. [paper]
  4. Edit at your own risk: evaluating the robustness of edited models to distribution shifts.
    Davis Brown, Charles Godfrey, Cody Nizinski, Jonathan Tu, Henry Kvinge. [paper]
  5. Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons.
    Yuheng Chen, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao. [paper]
  6. Linearity of Relation Decoding in Transformer Language Models
    Evan Hernandez, Martin Wattenberg, Arnab Sen Sharma, Jacob Andreas, Tal Haklay, Yonatan Belinkov, Kevin Meng, David Bau. [paper]
  7. KLoB: a Benchmark for Assessing Knowledge Locating Methods in Language Models
    Yiming Ju, Zheng Zhang. [paper]
  8. Inference-Time Intervention: Eliciting Truthful Answers from a Language Model (NeurIPS 2023)
    Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg. [paper] [code]
  9. Emptying the Ocean with a Spoon: Should We Edit Models? (EMNLP 2023 Findings)
    Yuval Pinter and Michael Elhadad. [paper]
  10. Unveiling the Pitfalls of Knowledge Editing for Large Language Models
    Zhoubo Li, Ningyu Zhang, Yunzhi Yao, Mengru Wang, Xi Chen and Huajun Chen. [paper]
  11. Editing Personality for LLMs
    Shengyu Mao, Ningyu Zhang, Xiaohan Wang, Mengru Wang, Yunzhi Yao, Yong Jiang, Pengjun Xie, Fei Huang and Huajun Chen. [paper]

🧰 Resources

Benchmarks and Tasks

| Edit Type | Benchmarks & Datasets |
| --- | --- |
| Fact Knowledge | ZSRE, ZSRE plus, CounterFact, CounterFact plus, CounterFact+, ECBD, MQUAKE |
| Multi-Lingual | Bi-ZsRE, Eva-KELLM |
| Sentiment | Convsent |
| Bias | Bias in Bios |
| Hallucination | WikiBio |
| Commonsense | MEMITcsk |
| Reasoning | Eva-KELLM |
| Privacy Information Protection | PrivQA, Knowledge Sanitation, Enron |
| Toxic Information | RealToxicityPrompts |
| MultiModal | MMEdit |

Tools

EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models.

FastEdit: Editing large language models within 10 seconds
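
As a concrete starting point, a single-fact edit with EasyEdit looks roughly like the sketch below. It is adapted from the EasyEdit README; the class names, arguments, and config path are assumptions that may differ across EasyEdit versions.

```python
# Sketch of a single ROME edit via EasyEdit (adapted from its README);
# treat exact names and paths as assumptions that may vary by version.
from easyeditor import BaseEditor, ROMEHyperParams

# Hyperparameter YAML shipped with the EasyEdit repo (illustrative path).
hparams = ROMEHyperParams.from_hparams('./hparams/ROME/gpt2-xl.yaml')
editor = BaseEditor.from_hparams(hparams)

metrics, edited_model, _ = editor.edit(
    prompts=['The capital of France is'],
    ground_truth=['Paris'],       # model's answer before the edit
    target_new=['London'],        # desired answer after the edit
    subject=['France'],           # entity whose fact is being edited
)
print(metrics)  # per-edit reliability / generalization / locality scores
```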

Citation

Please cite our paper if you find our work useful.

@article{DBLP:journals/corr/abs-2305-13172,
  author       = {Yunzhi Yao and
                  Peng Wang and
                  Bozhong Tian and
                  Siyuan Cheng and
                  Zhoubo Li and
                  Shumin Deng and
                  Huajun Chen and
                  Ningyu Zhang},
  title        = {Editing Large Language Models: Problems, Methods, and Opportunities},
  journal      = {CoRR},
  volume       = {abs/2305.13172},
  year         = {2023},
  url          = {https://doi.org/10.48550/arXiv.2305.13172},
  doi          = {10.48550/arXiv.2305.13172},
  eprinttype    = {arXiv},
  eprint       = {2305.13172},
  timestamp    = {Tue, 30 May 2023 17:04:46 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2305-13172.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

🎉 Contribution

Contributing to this paper list

  • We may have missed important works in this field; please contribute to this repo! Thanks in advance for your efforts.

Acknowledgement

  • We would like to express our gratitude to Longhui Yu for the kind reminder about the missing papers.