DeltaPapers

Must-read papers on parameter-efficient tuning methods (Delta Tuning) for pre-trained models.

Content

Why Parameter Efficient?
Keywords Convention
Papers
Contribution
- Contributors
- Contributing to this paper list

Why Parameter Efficient?

Ever-larger pre-trained models bring both the blessing of effectiveness on existing and unseen tasks and the curse of prohibitive adaptation cost. In this context, parameter-efficient tuning methods (delta tuning) have been developed and offer a promising way to adapt colossal models while tuning only a small portion of their parameters, thereby dramatically reducing the computational and storage costs of model adaptation. Beyond this practical value, delta tuning suggests that adapting a pre-trained model to a particular task may be a very simple process, which raises intriguing theoretical questions worth exploring.
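To make "a small portion of tunable parameters" concrete, here is a minimal arithmetic sketch. It assumes one particular kind of delta, a low-rank update in the style of LoRA (listed under Methodology), applied to a single weight matrix. The function names and the sizes used are hypothetical and chosen only for illustration; other delta tuning methods count their parameters differently.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole (d_out x d_in) weight matrix is updated."""
    return d_in * d_out


def low_rank_delta_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a low-rank delta B @ A added to a frozen weight.

    Assumes B has shape (d_out, rank) and A has shape (rank, d_in).
    """
    return rank * (d_in + d_out)


def tunable_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Share of the matrix's parameters that the low-rank delta actually trains."""
    return low_rank_delta_params(d_in, d_out, rank) / full_finetune_params(d_in, d_out)


if __name__ == "__main__":
    # Hypothetical sizes, not taken from any paper in this list.
    d, r = 1024, 8
    print(f"full fine-tuning: {full_finetune_params(d, d)} parameters")
    print(f"low-rank delta:   {low_rank_delta_params(d, d, r)} parameters")
    print(f"tunable fraction: {tunable_fraction(d, d, r):.4%}")
```

Under these assumed sizes the delta trains 16,384 parameters against 1,048,576 for the full matrix, about 1.6%. Only the delta needs to be stored per task, which is where the storage saving comes from.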

Keywords Convention

We follow the general idea of PromptPapers to label the papers.

The abbreviation of the work.

The main explored task of the work.

The main explored property of delta tuning methods in the work.

Other important information of the work.

Papers

Overview

Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models, Preprint 2022.

Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, Jing Yi, Weilin Zhao, Xiaozhi Wang, Zhiyuan Liu, Hai-Tao Zheng, Jianfei Chen, Yang Liu, Jie Tang, Juanzi Li, Maosong Sun. [pdf], [OpenDelta]

Methodology

Parameter-Efficient Transfer Learning for NLP, ICML 2019.

Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly. [pdf], [Project]
BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning, ICML 2019.

Asa Cooper Stickland, Iain Murray. [pdf], [Project]
Masking as an Efficient Alternative to Finetuning for Pretrained Language Models, EMNLP 2020.

Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze. [pdf], [Project]
Prefix-Tuning: Optimizing Continuous Prompts for Generation, ACL 2021.

Xiang Lisa Li, Percy Liang. [pdf], [Project]
Parameter-Efficient Transfer Learning with Diff Pruning, ACL 2021.

Demi Guo, Alexander M. Rush, Yoon Kim. [pdf], [Project]
The Power of Scale for Parameter-Efficient Prompt Tuning, EMNLP 2021.

Brian Lester, Rami Al-Rfou, Noah Constant. [pdf], [OpenPrompt Implementation]
COMPACTER: Efficient Low-Rank Hypercomplex Adapter Layers, NeurIPS 2021.

Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder. [pdf], [Project]
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models, ACL 2022.

Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg. [pdf], [Project]
LoRA: Low-Rank Adaptation of Large Language Models, ICLR 2022.

Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen. [pdf], [Project]
Fast Model Editing at Scale, ICLR 2022.

Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning. [pdf], [Project]
Editing Factual Knowledge in Language Models, EMNLP 2021.

Nicola De Cao, Wilker Aziz, Ivan Titov. [pdf], [Project]
Training Neural Networks with Fixed Sparse Masks, NeurIPS 2021.

Yi-Lin Sung, Varun Nair, Colin Raffel. [pdf], [Project]
GPT Understands, Too, Preprint 2020.

Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, Jie Tang. [pdf], [Project]
LiST: Lite Self-training Makes Efficient Few-shot Learners, Preprint 2021.

Yaqing Wang, Subhabrata Mukherjee, Xiaodong Liu, Jing Gao, Ahmed Hassan Awadallah, Jianfeng Gao. [pdf], [Project]
Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks, ACL 2021.

Rabeeh Karimi Mahabadi, Sebastian Ruder, Mostafa Dehghani, James Henderson. [pdf], [Project]

Analysis

Towards a Unified View of Parameter-Efficient Transfer Learning, ICLR 2022.

Junxian He, Chunting Zhou, Xuezhe Ma, Taylor Berg-Kirkpatrick, Graham Neubig. [pdf], [Project]
AdapterBias: Parameter-efficient Token-dependent Embedding Shift for Adapters in NLP Tasks, Preprint 2021.

Anonymous. [pdf]
AdapterFusion: Non-Destructive Task Composition for Transfer Learning, Preprint 2020.

Jonas Pfeiffer, Aishwarya Kamath, Andreas Rücklé, Kyunghyun Cho, Iryna Gurevych. [pdf], [Project]
On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation, ACL 2021.

Ruidan He, Linlin Liu, Hai Ye, Qingyu Tan, Bosheng Ding, Liying Cheng, Jia-Wei Low, Lidong Bing, Luo Si. [pdf]
UNIPELT: A Unified Framework for Parameter-Efficient Language Model Tuning, Preprint 2021.

Yuning Mao, Lambert Mathias, Rui Hou, Amjad Almahairi, Hao Ma, Jiawei Han, Wen-tau Yih, Madian Khabsa. [pdf], [Project]
Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data, ICLR 2021.

Jonathan Pilault, Amine Elhattami, Christopher Pal. [pdf], [Project]
Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning, EMNLP 2021.

Runxin Xu, Fuli Luo, Zhiyuan Zhang, Chuanqi Tan, Baobao Chang, Songfang Huang, Fei Huang. [pdf], [Project]
Exploring Low-dimensional Intrinsic Task Subspace via Prompt Tuning, Preprint 2021.

Yujia Qin, Xiaozhi Wang, Yusheng Su, Yankai Lin, Ning Ding, Zhiyuan Liu, Juanzi Li, Lei Hou, Peng Li, Maosong Sun, Jie Zhou. [pdf], [Project]
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning, ACL 2021.

Armen Aghajanyan, Luke Zettlemoyer, Sonal Gupta. [pdf]
Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators, ACL 2021.

Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao, Z.Y. Xie, Zhong-Yi Lu, Ji-Rong Wen. [pdf], [Project]
Movement Pruning: Adaptive Sparsity by Fine-Tuning, NeurIPS 2020.

Victor Sanh, Thomas Wolf, Alexander M. Rush. [pdf], [Project]
Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with 1/n Parameters, ICLR 2021.

Aston Zhang, Yi Tay, Shuai Zhang, Alvin Chan, Anh Tuan Luu, Siu Cheung Hui, Jie Fu. [pdf]
Shapeshifter: a Parameter-efficient Transformer using Factorized Reshaped Matrices, NeurIPS 2021.

Aliakbar Panahi, Seyran Saeedi, Tom Arodz. [pdf], [Project]
AdapterDrop: On the Efficiency of Adapters in Transformers, EMNLP 2021.

Andreas Rücklé, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, Iryna Gurevych. [pdf]

Applications & Tools

OpenDelta: A Flexible Plug-in Tool for Delta Tuning. [Project]
Lightweight Adapter Tuning for Multilingual Speech Translation, ACL 2021.

Hang Le, Juan Pino, Changhan Wang, Jiatao Gu, Didier Schwab, Laurent Besacier. [pdf], [Project]
VL-ADAPTER: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks, CVPR 2022.

Yi-Lin Sung, Jaemin Cho, Mohit Bansal. [pdf], [Project]
AdapterHub: A Framework for Adapting Transformers, EMNLP demo 2020.

Jonas Pfeiffer, Andreas Rücklé, Clifton Poth, Aishwarya Kamath, Ivan Vulić, Sebastian Ruder, Kyunghyun Cho, Iryna Gurevych. [pdf], [Project]

Blogs

Parameter-efficient transfer learning for NLP, Overview Blog Post 2022, [Blog]

Contribution

Contributors

Contributing to this paper list

First, think about which category the work should belong to.
Second, use the same format as the other entries to describe the work. Note that there should be an empty line between the title and the author list, and take care of the indentation.
Then, add keyword tags and the pdf link of the paper. If it is an arXiv publication, we prefer the /abs/ format to the /pdf/ format.

thunlp/DeltaPapers