
Knowledge Editing for LLMs Papers

Must-read papers on knowledge editing for large language models.

🌟 Why Knowledge Editing?

Knowledge Editing is a compelling field of research that focuses on efficient modifications to the behavior of models, particularly foundation models. The aim is to make targeted changes within a specified scope of interest without degrading the model's performance across the broader range of inputs.
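
To make this concrete, here is a minimal sketch in plain Python of how the papers below typically specify an edit and score it along three standard axes: reliability on the edited input itself, generality on in-scope paraphrases, and locality on unrelated, out-of-scope inputs. All names here (EditDescriptor, evaluate) are hypothetical placeholders, not the API of any particular method.

```python
# Minimal illustration of the knowledge-editing setting. All names are
# hypothetical placeholders, not the API of any specific method or library.
from dataclasses import dataclass, field

@dataclass
class EditDescriptor:
    prompt: str          # input whose behavior should change
    target_new: str      # desired post-edit output
    paraphrases: list = field(default_factory=list)  # in-scope: should also change
    unrelated: list = field(default_factory=list)    # out-of-scope: must not change

def evaluate(pre_edit_model, post_edit_model, edit):
    """Score an edit on the three criteria used throughout this list."""
    reliability = post_edit_model(edit.prompt) == edit.target_new
    generality = all(post_edit_model(p) == edit.target_new for p in edit.paraphrases)
    locality = all(post_edit_model(q) == pre_edit_model(q) for q in edit.unrelated)
    return {"reliability": reliability, "generality": generality, "locality": locality}

# Example: correct a model that answers this prompt incorrectly.
edit = EditDescriptor(
    prompt="The capital of France is",
    target_new="Paris",
    paraphrases=["France's capital city is"],
    unrelated=["The capital of Germany is"],
)
```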

Keywords

Knowledge Editing has strong connections with the following topics:

  • Updating and fixing bugs for large language models
  • Language models as knowledge base, locating knowledge in large language models
  • Lifelong learning, unlearning, etc.
  • Security and privacy for large language models

📜 Papers

This is a collection of research and review papers on Knowledge Editing. Suggestions and pull requests are welcome to help share the latest research progress.

Overview

A Comprehensive Study of Knowledge Editing for Large Language Models
Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni, Siyuan Cheng, Ziwen Xu, Xin Xu, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Lei Liang, Zhiqiang Zhang, Xiaowei Zhu, Jun Zhou, Huajun Chen. [paper][benchmark][code]

Editing Large Language Models: Problems, Methods, and Opportunities, EMNLP 2023 Main Conference Paper
Yunzhi Yao, Peng Wang, Bozhong Tian, Siyuan Cheng, Zhoubo Li, Shumin Deng, Huajun Chen, Ningyu Zhang. [paper][code]

Editing Large Language Models, AACL 2023 Tutorial
Ningyu Zhang, Yunzhi Yao, Shumin Deng. [GitHub] [Google Drive] [Baidu Pan]

Knowledge Editing for Large Language Models: A Survey
Song Wang, Yaochen Zhu, Haochen Liu, Zaiyi Zheng, Chen Chen, Jundong Li. [paper]

A Survey on Knowledge Editing of Neural Networks
Vittorio Mazzia, Alessandro Pedrani, Andrea Caciolai, Kay Rottmann, Davide Bernardi. [paper]

Knowledge Unlearning for LLMs: Tasks, Methods, and Challenges
Nianwen Si, Hao Zhang, Heyu Chang, Wenlin Zhang, Dan Qu, Weiqiang Zhang. [paper]

Methods

Preserve Parameters

Memory-based
  1. Memory-Based Model Editing at Scale (ICML 2022)
    Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn. [paper] [code] [demo]

  2. Fixing Model Bugs with Natural Language Patches. (EMNLP 2022)
    Shikhar Murty, Christopher D. Manning, Scott M. Lundberg, Marco Túlio Ribeiro. [paper] [code]

  3. MemPrompt: Memory-assisted Prompt Editing with User Feedback. (EMNLP 2022)
    Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang. [paper] [code] [page] [video]

  4. Large Language Models with Controllable Working Memory.
    Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix Yu, Sanjiv Kumar. [paper]

  5. Can We Edit Factual Knowledge by In-Context Learning?
    Ce Zheng, Lei Li, Qingxiu Dong, Yuxuan Fan, Zhiyong Wu, Jingjing Xu, Baobao Chang. [paper]

  6. Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge
    Yasumasa Onoe, Michael J.Q. Zhang, Shankar Padmanabhan, Greg Durrett, Eunsol Choi. [paper]

  7. MQUAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
    Zexuan Zhong, Zhengxuan Wu, Christopher D. Manning, Christopher Potts, Danqi Chen. [paper]

  8. Retrieval-augmented Multilingual Knowledge Editing
    Weixuan Wang, Barry Haddow, Alexandra Birch. [paper] [code]

Additional Parameters
  1. Calibrating Factual Knowledge in Pretrained Language Models. (EMNLP 2022)
    Qingxiu Dong, Damai Dai, Yifan Song, Jingjing Xu, Zhifang Sui, Lei Li. [paper] [code]

  2. Transformer-Patcher: One Mistake worth One Neuron. (ICLR 2023)
    Zeyu Huang, Yikang Shen, Xiaofeng Zhang, Jie Zhou, Wenge Rong, Zhang Xiong. [paper] [code]

  3. Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors.
    Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi. [paper] [code]

  4. Neural Knowledge Bank for Pretrained Transformers
    Damai Dai, Wenbin Jiang, Qingxiu Dong, Yajuan Lyu, Qiaoqiao She, Zhifang Sui. [paper]

  5. Rank-One Editing of Encoder-Decoder Models
    Vikas Raunak, Arul Menezes. [paper]

  6. MELO: Enhancing Model Editing with Neuron-Indexed Dynamic LoRA. (AAAI 2024)
    Lang Yu, Qin Chen, Jie Zhou, Liang He. [paper] [code]

Change LM's representation space
  1. Inspecting and Editing Knowledge Representations in Language Models
    Evan Hernandez, Belinda Z. Li, Jacob Andreas. [paper] [code]

Modify Parameters

Finetuning
  1. Plug-and-Play Adaptation for Continuously-updated QA. (ACL 2022 Findings)
    Kyungjae Lee, Wookje Han, Seung-won Hwang, Hwaran Lee, Joonsuk Park, Sang-Woo Lee. [paper] [code]

  2. Modifying Memories in Transformer Models.
    Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar. [paper]

  3. Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language Models
    Shiwen Ni, Dingwei Chen, Chengming Li, Xiping Hu, Ruifeng Xu and Min Yang. [paper]

Meta-learning
  1. Editing Factual Knowledge in Language Models. (EMNLP 2021)
    Nicola De Cao, Wilker Aziz, Ivan Titov. [paper] [code]

  2. Fast Model Editing at Scale. (ICLR 2022)
    Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning. [paper] [code] [page]

  3. Editable Neural Networks. (ICLR 2020)
    Anton Sinitsin, Vsevolod Plokhotnyuk, Dmitry V. Pyrkin, Sergei Popov, Artem Babenko. [paper] [code]

  4. Editing Language Model-based Knowledge Graph Embeddings? (AAAI 2024)
    Siyuan Cheng, Ningyu Zhang, Bozhong Tian, Xi Chen, Qingbing Liu, Huajun Chen. [paper] [code]

  5. Massive Editing for Large Language Models via Meta Learning
    Chenmien Tan, Ge Zhang, Jie Fu. [paper] [code]

Locate and edit
  1. Editing a classifier by rewriting its prediction rules. (NeurIPS 2021)
    Shibani Santurkar, Dimitris Tsipras, Mahalaxmi Elango, David Bau, Antonio Torralba, Aleksander Madry. [paper] [code]

  2. Language Anisotropic Cross-Lingual Model Editing.
    Yang Xu, Yutai Hou, Wanxiang Che. [paper]

  3. Repairing Neural Networks by Leaving the Right Past Behind.
    Ryutaro Tanno, Melanie F. Pradier, Aditya Nori, Yingzhen Li. [paper]

  4. Locating and Editing Factual Associations in GPT. (NeurIPS 2022)
    Kevin Meng, David Bau, Alex Andonian, Yonatan Belinkov. [paper] [code] [page] [video]

  5. Mass-Editing Memory in a Transformer.
    Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau. [paper] [code] [page] [demo]

  6. Editing models with task arithmetic.
    Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Ludwig Schmidt, Hannaneh Hajishirzi, Ali Farhadi. [paper]

  7. Editing Commonsense Knowledge in GPT.
    Anshita Gupta, Debanjan Mondal, Akshay Krishna Sheshadri, Wenlong Zhao, Xiang Lorraine Li, Sarah Wiegreffe, Niket Tandon. [paper]

  8. Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs.
    Peter Hase, Mona Diab, Asli Celikyilmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srinivasan Iyer. [paper] [code]

  9. Detecting Edit Failures In Large Language Models: An Improved Specificity Benchmark.
    Jason Hoelscher-Obermaier, Julia Persson, Esben Kran, Ioannis Konstas, Fazl Barez. [paper]

  10. Knowledge Neurons in Pretrained Transformers. (ACL 2022)
    Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, Furu Wei. [paper] [code] [code by EleutherAI]

  11. LEACE: Perfect linear concept erasure in closed form.
    Nora Belrose, David Schneider-Joseph, Shauli Ravfogel, Ryan Cotterell, Edward Raff, Stella Biderman. [paper]

  12. Transformer Feed-Forward Layers Are Key-Value Memories. (EMNLP 2021)
    Mor Geva, Roei Schuster, Jonathan Berant, Omer Levy. [paper]

  13. Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space. (EMNLP 2022)
    Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg. [paper]

  14. PMET: Precise Model Editing in a Transformer.
    Xiaopeng Li, Shasha Li, Shezheng Song, Jing Yang, Jun Ma, Jie Yu. [paper] [code]

  15. Unlearning Bias in Language Models by Partitioning Gradients. (ACL 2023 Findings)
    Charles Yu, Sullam Jeoung, Anish Kasi, Pengfei Yu, Heng Ji. [paper] [code]

  16. DEPN: Detecting and Editing Privacy Neurons in Pretrained Language Models (EMNLP 2023)
    Xinwei Wu, Junzhuo Li, Minghui Xu, Weilong Dong, Shuangzhi Wu, Chao Bian, Deyi Xiong. [paper]

  17. Untying the Reversal Curse via Bidirectional Language Model Editing
    Jun-Yu Ma, Jia-Chen Gu, Zhen-Hua Ling, Quan Liu, Cong Liu. [paper]

  18. PokeMQA: Programmable knowledge editing for Multi-hop Question Answering
    Hengrui Gu, Kaixiong Zhou, Xiaotian Han, Ninghao Liu, Ruobing Wang, Xin Wang. [paper] [code]

More Related Papers

  1. FRUIT: Faithfully Reflecting Updated Information in Text. (NAACL 2022)
    Robert L. Logan IV, Alexandre Passos, Sameer Singh, Ming-Wei Chang. [paper] [code]

  2. Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning. (EMNLP 2022)
    Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark. [paper] [code] [video]

  3. Towards Tracing Factual Knowledge in Language Models Back to the Training Data. (EMNLP 2022)
    Ekin Akyürek, Tolga Bolukbasi, Frederick Liu, Binbin Xiong, Ian Tenney, Jacob Andreas, Kelvin Guu. [paper]

  4. Prompting GPT-3 To Be Reliable.
    Chenglei Si, Zhe Gan, Zhengyuan Yang, Shuohang Wang, Jianfeng Wang, Jordan Boyd-Graber, Lijuan Wang. [paper]

  5. Patching open-vocabulary models by interpolating weights. (NeurIPS 2022)
    Gabriel Ilharco, Mitchell Wortsman, Samir Yitzhak Gadre, Shuran Song, Hannaneh Hajishirzi, Simon Kornblith, Ali Farhadi, Ludwig Schmidt. [paper] [code]

  6. Decouple knowledge from parameters for plug-and-play language modeling. (ACL 2023 Findings)
    Xin Cheng, Yankai Lin, Xiuying Chen, Dongyan Zhao, Rui Yan. [paper] [code]

  7. Backpack Language Models
    John Hewitt, John Thickstun, Christopher D. Manning, Percy Liang. [paper]

  8. Learning to Model Editing Processes. (EMNLP 2022)
    Machel Reid, Graham Neubig. [paper]

  9. Trends in Integration of Knowledge and Large Language Models: A Survey and Taxonomy of Methods, Benchmarks, and Applications.
    Zhangyin Feng, Weitao Ma, Weijiang Yu, Lei Huang, Haotian Wang, Qianglong Chen, Weihua Peng, Xiaocheng Feng, Bing Qin, Ting Liu. [paper]

  10. DUnE: Dataset for Unified Editing. (EMNLP 2023)
    Afra Feyza Akyürek, Eric Pan, Garry Kuwanto, Derry Wijaya. [paper]

Analysis

  1. Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models.
    Peter Hase, Mohit Bansal, Been Kim, Asma Ghandeharioun. [paper] [code]
  2. Dissecting Recall of Factual Associations in Auto-Regressive Language Models
    Mor Geva, Jasmijn Bastings, Katja Filippova, Amir Globerson. [paper]
  3. Evaluating the Ripple Effects of Knowledge Editing in Language Models
    Roi Cohen, Eden Biran, Ori Yoran, Amir Globerson, Mor Geva. [paper]
  4. Edit at your own risk: evaluating the robustness of edited models to distribution shifts.
    Davis Brown, Charles Godfrey, Cody Nizinski, Jonathan Tu, Henry Kvinge. [paper]
  5. Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons.
    Yuheng Chen, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao. [paper]
  6. Linearity of Relation Decoding in Transformer Language Models
    Evan Hernandez, Martin Wattenberg, Arnab Sen Sharma, Jacob Andreas, Tal Haklay, Yonatan Belinkov, Kevin Meng, David Bau. [paper]
  7. KLoB: a Benchmark for Assessing Knowledge Locating Methods in Language Models
    Yiming Ju, Zheng Zhang. [paper]
  8. Inference-Time Intervention: Eliciting Truthful Answers from a Language Model (NeurIPS 2023)
    Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg. [paper] [code]
  9. Emptying the Ocean with a Spoon: Should We Edit Models? (EMNLP 2023 Findings)
    Yuval Pinter and Michael Elhadad. [paper]
  10. Unveiling the Pitfalls of Knowledge Editing for Large Language Models
    Zhoubo Li, Ningyu Zhang, Yunzhi Yao, Mengru Wang, Xi Chen and Huajun Chen. [paper]
  11. Editing Personality for LLMs
    Shengyu Mao, Ningyu Zhang, Xiaohan Wang, Mengru Wang, Yunzhi Yao, Yong Jiang, Pengjun Xie, Fei Huang and Huajun Chen. [paper]
  12. Evaluating Dependencies in Fact Editing for Language Models: Specificity and Implication Awareness. (EMNLP 2023 Findings)
    Zichao Li, Ines Arous, Siva Reddy, Jackie C.K. Cheung. [paper]
  13. Finding and Editing Multi-Modal Neurons in Pre-Trained Transformer
    Haowen Pan, Yixin Cao, Xiaozhi Wang, Xun Yang. [paper]
  14. Assessing Knowledge Editing in Language Models via Relation Perspective
    Yifan Wei, Xiaoyan Yu, Huanhuan Ma, Fangyu Lei, Yixuan Weng, Ran Song, Kang Liu. [paper]
  15. History Matters: Temporal Knowledge Editing in Large Language Model. (AAAI 2024)
    Xunjian Yin, Jin Jiang, Liming Yang, Xiaojun Wan. [paper]
  16. Cross-Lingual Knowledge Editing in Large Language Models
    Jiaan Wang, Yunlong Liang, Zengkui Sun, Yuxuan Cao, Jiarong Xu. [paper]
  17. Large Language Models Relearn Removed Concepts
    Michelle Lo, Shay B. Cohen, Fazl Barez. [paper]

🧰 Resources

Benchmarks and Tasks

| Edit Type | Benchmarks & Datasets |
| --- | --- |
| Fact Knowledge | ZSRE, ZSRE plus, CounterFact, CounterFact plus, CounterFact+, ECBD, MQUAKE, DepEdit |
| Multi-Lingual | Bi-ZsRE, Eva-KELLM, MzsRE |
| Sentiment | Convsent |
| Bias | Bias in Bios |
| Hallucination | WikiBio |
| Commonsense | MEMITcsk |
| Reasoning | Eva-KELLM |
| Privacy Information Protection | PrivQA, Knowledge Sanitation, Enron |
| Unified Benchmark | DUnE |
| Toxic Information | RealToxicityPrompts, Toxicity Unlearning |
| MultiModal | MMEdit |

Tools

EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models.

FastEdit: Editing large language models within 10 seconds.
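
For instance, EasyEdit exposes many of the methods surveyed above through a unified editor interface. The snippet below is a sketch based on the usage pattern in the EasyEdit README; exact class names, arguments, and hyperparameter file paths may differ across versions.

```python
# Sketch adapted from the EasyEdit README; verify the names against the
# version you install, as the API may have changed.
from easyeditor import BaseEditor, ROMEHyperParams

# Load hyperparameters for one editing method (ROME) and one backbone model.
hparams = ROMEHyperParams.from_hparams('./hparams/ROME/gpt2-xl.yaml')
editor = BaseEditor.from_hparams(hparams)

# Apply a single factual edit; returns evaluation metrics and the edited model.
metrics, edited_model, _ = editor.edit(
    prompts=['The mother tongue of Danielle Darrieux is'],
    ground_truth=['French'],
    target_new=['English'],
    subject=['Danielle Darrieux'],
)
```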

Citation

Please cite our paper if you find our work useful.

@article{DBLP:journals/corr/abs-2305-13172,
  author       = {Yunzhi Yao and
                  Peng Wang and
                  Bozhong Tian and
                  Siyuan Cheng and
                  Zhoubo Li and
                  Shumin Deng and
                  Huajun Chen and
                  Ningyu Zhang},
  title        = {Editing Large Language Models: Problems, Methods, and Opportunities},
  journal      = {CoRR},
  volume       = {abs/2305.13172},
  year         = {2023},
  url          = {https://doi.org/10.48550/arXiv.2305.13172},
  doi          = {10.48550/arXiv.2305.13172},
  eprinttype    = {arXiv},
  eprint       = {2305.13172},
  timestamp    = {Tue, 30 May 2023 17:04:46 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2305-13172.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

🎉 Contribution

Contributing to this paper list

  • We may have missed important works in this field; please contribute to this repo! Thanks in advance for your efforts.

Acknowledgement

  • We would like to express our gratitude to Longhui Yu for kindly pointing out missing papers.