A curated list of prompt/adapter learning methods for vision-language models. Methods fall into three categories:

- Those that use text-based learnable prompts/adapters.
- Those that use image-based learnable prompts/adapters.
- Those that use both text- and image-based learnable prompts/adapters.
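For orientation, a text-based learnable prompt replaces a hand-written template such as "a photo of a {class}" with trainable context vectors that are concatenated with the class-name token embeddings and fed to the frozen text encoder; CoOp (below) is the canonical example. A minimal PyTorch sketch, with illustrative names (`TextPromptLearner`, `n_ctx`) rather than any repository's actual API:

```python
import torch
import torch.nn as nn

class TextPromptLearner(nn.Module):
    """CoOp-style sketch (illustrative, not the official code): M learnable
    context vectors shared across classes, prepended to the frozen token
    embeddings of each class name before the frozen text encoder."""

    def __init__(self, class_embeds, n_ctx=16, ctx_dim=512):
        super().__init__()
        # The only trainable parameters: M x D context vectors.
        self.ctx = nn.Parameter(torch.randn(n_ctx, ctx_dim) * 0.02)
        # Frozen token embeddings of the C class names, shape (C, L, D).
        self.register_buffer("class_embeds", class_embeds)

    def forward(self):
        C = self.class_embeds.size(0)
        ctx = self.ctx.unsqueeze(0).expand(C, -1, -1)      # (C, M, D)
        # [learned context][class-name tokens] -> frozen text encoder.
        return torch.cat([ctx, self.class_embeds], dim=1)  # (C, M+L, D)

# e.g. 10 classes, 8 name tokens, width 512 (dummy embeddings):
prompts = TextPromptLearner(torch.randn(10, 8, 512))()
```

An image-side counterpart is sketched after the VPT entry further down.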
- A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models. [Paper]
- Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey. [Paper]
Base-to-Novel Generalization (ViT-B/16 CLIP).
Methods | Pub | Base | Novel | HM | Code |
---|---|---|---|---|---|
CLIP | ICML 21 | 69.34 | 74.22 | 71.70 | Link |
CoOp | IJCV 22 | 82.69 | 63.22 | 71.66 | Link |
CoCoOp | CVPR 22 | 80.47 | 71.69 | 75.83 | Link |
ProDA | CVPR 22 | 81.56 | 72.30 | 76.65 | Link |
RPO | ICCV 23 | 81.13 | 75.00 | 77.78 | Link |
MaPLe | CVPR 23 | 82.28 | 75.14 | 78.55 | Link |
MetaPrompt | TIP 24 | 83.65 | 75.48 | 79.09 | --- |
DePT | CVPR 24 | 83.62 | 75.04 | 79.10 | Link |
LASP | CVPR 23 | 83.18 | 76.11 | 79.48 | --- |
TCP | CVPR 24 | 84.13 | 75.36 | 79.51 | Link |
MMA | CVPR 24 | 83.20 | 76.80 | 79.87 | Link |
PromptSRC | ICCV 23 | 84.26 | 76.10 | 79.97 | Link |
HPT | AAAI 24 | 84.32 | 76.86 | 80.23 | Link |
CoPrompt | ICLR 24 | 84.00 | 77.23 | 80.48 | Link |
PromptKD | CVPR 24 | 86.96 | 80.73 | 83.73 | Link |
Table 1. Base-to-novel generalization: average results over 11 datasets. HM is the harmonic mean of base and novel accuracy (the main comparison metric).
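HM is computed per dataset and then averaged, so recomputing it from the averaged Base/Novel columns reproduces most rows closely; a two-line check:

```python
def harmonic_mean(base, novel):
    """HM column of Table 1: harmonic mean of base/novel accuracy."""
    return 2 * base * novel / (base + novel)

print(round(harmonic_mean(69.34, 74.22), 2))  # 71.7 -- matches the CLIP row
```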
CoOp
Learning to Prompt for Vision-Language Models. IJCV 2022.
[Paper] [Code]

CoCoOp
Conditional Prompt Learning for Vision-Language Models. CVPR 2022.
[Paper] [Code]

ProDA
Prompt Distribution Learning. CVPR 2022.
[Paper] [Code]

VPT
Visual Prompt Tuning. ECCV 2022.
[Paper] [Code]
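Where CoOp learns prompts on the text side, VPT prepends learnable tokens to the patch sequence of a frozen ViT. A minimal sketch of the shallow variant (class and parameter names are illustrative, not the official implementation):

```python
import torch
import torch.nn as nn

class VisualPromptTokens(nn.Module):
    """VPT-shallow-style sketch: learnable prompt tokens inserted between
    the [CLS] token and the patch embeddings of a frozen ViT."""

    def __init__(self, n_prompts=10, embed_dim=768):
        super().__init__()
        self.prompts = nn.Parameter(torch.empty(1, n_prompts, embed_dim))
        nn.init.uniform_(self.prompts, -0.1, 0.1)

    def forward(self, x):                        # x: (B, 1 + n_patches, D)
        p = self.prompts.expand(x.size(0), -1, -1)
        # [CLS] + prompts + patches, then through the frozen blocks.
        return torch.cat([x[:, :1], p, x[:, 1:]], dim=1)
```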
MaPLe
MaPLe: Multi-modal Prompt Learning. CVPR 2023.
[Paper] [Code]

KgCoOp
Visual-Language Prompt Tuning with Knowledge-guided Context Optimization. CVPR 2023.
[Paper] [Code]

LASP
LASP: Text-to-Text Optimization for Language-Aware Soft Prompting of Vision & Language Models. CVPR 2023.
[Paper]

DAM-VP
Diversity-Aware Meta Visual Prompting. CVPR 2023.
[Paper] [Code]

TaskRes
Task Residual for Tuning Vision-Language Models. CVPR 2023.
[Paper] [Code]

RPO
Read-only Prompt Optimization for Vision-Language Few-shot Learning. ICCV 2023.
[Paper] [Code]

KAPT
Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models. ICCV 2023.
[Paper]

ProGrad
Prompt-aligned Gradient for Prompt Tuning. ICCV 2023.
[Paper] [Code]

PromptSRC
Self-regulating Prompts: Foundational Model Adaptation without Forgetting. ICCV 2023.
[Paper] [Code]

DeFo
Learning to Decompose Visual Features with Latent Textual Prompts. ICLR 2023.
[Paper]

POMP
Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition. NeurIPS 2023.
[Paper] [Code]
MetaPrompt
Learning Domain Invariant Prompt for Vision-Language Models. TIP 2024.
[Paper]

SA2VP
SA2VP: Spatially Aligned-and-Adapted Visual Prompt. AAAI 2024.
[Paper] [Code]

HPT
Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models. AAAI 2024.
[Paper] [Code]

LaViP
LaViP: Language-Grounded Visual Prompts. AAAI 2024.
[Paper]

CoPrompt
Consistency-guided Prompt Learning for Vision-Language Models. ICLR 2024.
[Paper] [Code]

ProText
Learning to Prompt with Text Only Supervision for Vision-Language Models. arXiv 2024.
[Paper] [Code]

PromptKD
PromptKD: Unsupervised Prompt Distillation for Vision Language Models. CVPR 2024.
[Paper] [Code]

DePT
DePT: Decoupled Prompt Tuning. CVPR 2024.
[Paper] [Code]

ArGue
ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models. CVPR 2024.
[Paper]

TCP
TCP: Textual-based Class-aware Prompt tuning for Visual-Language Model. CVPR 2024.
[Paper] [Code]

MMA
MMA: Multi-Modal Adapter for Vision-Language Models. CVPR 2024.
[Paper] [Code]
Methods | Pub | ImageNet | -A | -V2 | -R | -S | Avg. | Code |
---|---|---|---|---|---|---|---|---|
CoOp | IJCV 22 | 71.51 | 49.71 | 64.20 | 75.21 | 47.99 | 59.28 | Link |
CoCoOp | CVPR 22 | 71.02 | 50.63 | 64.07 | 76.18 | 48.75 | 59.91 | Link |
TPT | NeurIPS 22 | 68.98 | 54.77 | 63.45 | 77.06 | 47.94 | 60.81 | Link |
TPT+CoOp | NeurIPS 22 | 73.61 | 57.95 | 66.83 | 77.27 | 49.29 | 62.84 | Link |
PromptAlign | NeurIPS 23 | --- | 59.37 | 65.29 | 79.33 | 50.23 | 63.55 | Link |
TPS+CoOp | arXiv 24 | 73.73 | 60.49 | 66.84 | 77.44 | 49.08 | 63.46 | Link |
RLCF | ICLR 24 | 73.23 | 65.45 | 69.77 | 83.35 | 54.74 | 68.33 | Link |
RLCF+CoOp | ICLR 24 | 76.05 | 69.74 | 70.62 | 84.51 | 56.49 | 70.34 | Link |
Table 2. Test-time prompt tuning methods on out-of-distribution data. -A, -V2, -R, and -S denote ImageNet-A, ImageNet-V2, ImageNet-R, and ImageNet-Sketch; Avg. is the mean over the four OOD variants.
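The methods in Table 2 adapt at inference time without labels. TPT, for instance, takes one or a few gradient steps on the prompt alone, minimizing the entropy of the prediction averaged over the most confident augmented views of each test image. A hedged sketch of that objective (the `keep_ratio` filter mirrors TPT's confidence selection, but the exact rule and value are illustrative):

```python
import torch

def marginal_entropy_loss(logits, keep_ratio=0.1):
    """TPT-style sketch: keep the lowest-entropy augmented views of one
    test image, average their probabilities, and return the entropy of
    that marginal distribution. logits: (n_views, n_classes)."""
    probs = logits.softmax(dim=-1)
    per_view = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    k = max(1, int(keep_ratio * logits.size(0)))
    idx = per_view.topk(k, largest=False).indices   # confident views only
    avg = probs[idx].mean(dim=0)
    return -(avg * avg.clamp_min(1e-12).log()).sum()
```

Only the prompt vectors receive these gradients; both CLIP encoders stay frozen.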
TPT
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models. NeurIPS 2022.
[Paper] [Code]

SwapPrompt
SwapPrompt: Test-Time Prompt Adaptation for Vision-Language Models. NeurIPS 2023.
[Paper]

PromptAlign
Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization. NeurIPS 2023.
[Paper] [Code]

TPS
Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models. arXiv 2024.
[Paper] [Code]

RLCF
Test-time Adaptation with CLIP Reward for Zero-shot Generalization in Vision-Language Models. ICLR 2024.
[Paper] [Code]

InTTA
Invariant Test-Time Adaptation for Vision-Language Model Generalization. arXiv 2024.
[Paper] [Code]
CLIP-Adapter
CLIP-Adapter: Better Vision-Language Models with Feature Adapters. arXiv 2021.
[Paper] [Code]
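CLIP-Adapter keeps both encoders frozen and learns a small bottleneck MLP on top of the extracted features, blending the adapted and original features through a residual ratio. A sketch under those assumptions (`reduction` and `alpha` are illustrative defaults, not the paper's tuned values):

```python
import torch
import torch.nn as nn

class FeatureAdapter(nn.Module):
    """CLIP-Adapter-style sketch: a bottleneck MLP over frozen CLIP
    features, mixed with the original feature by residual ratio alpha."""

    def __init__(self, dim=512, reduction=4, alpha=0.2):
        super().__init__()
        self.alpha = alpha
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim // reduction), nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim), nn.ReLU(inplace=True),
        )

    def forward(self, f):            # f: frozen CLIP feature, (B, D)
        return self.alpha * self.mlp(f) + (1 - self.alpha) * f
```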
Efficient-Prompt
Prompting Visual-Language Models for Efficient Video Understanding. ECCV 2022.
[Paper] [Code]

X-CLIP
Expanding Language-Image Pretrained Models for General Video Recognition. ECCV 2022.
[Paper] [Code]

RePro
Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection. ICLR 2023.
[Paper] [Code]
L2P
Learning to Prompt for Continual Learning. CVPR 2022.
[Paper] [Code]

DualPrompt
DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning. ECCV 2022.
[Paper] [Code]

EvoPrompt
Evolving Parameterized Prompt Memory for Continual Learning. AAAI 2024.
[Paper]

CPP
Steering Prototypes with Prompt-tuning for Rehearsal-free Continual Learning. WACV 2024.
[Paper] [Code]

CPrompt
Consistent Prompting for Rehearsal-Free Continual Learning. CVPR 2024.
[Paper] [Code]

DIKI
Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models. ECCV 2024.
[Paper] [Code]
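A recurring design in the prompt-based continual learners above (L2P, DualPrompt, and successors) is a prompt pool: (key, prompt) pairs from which a frozen query feature selects the top-k prompts per example, so task knowledge accumulates outside the backbone. A simplified sketch (pool size, prompt length, and the cosine-similarity selection are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptPool(nn.Module):
    """L2P-style sketch: a pool of (key, prompt) pairs; each example's
    frozen query feature picks its top-k prompts by cosine similarity."""

    def __init__(self, pool_size=10, prompt_len=5, dim=768, top_k=5):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(pool_size, dim))
        self.prompts = nn.Parameter(torch.randn(pool_size, prompt_len, dim))
        self.top_k = top_k

    def forward(self, query):                       # query: (B, D), frozen
        sim = F.normalize(query, dim=-1) @ F.normalize(self.keys, dim=-1).t()
        idx = sim.topk(self.top_k, dim=-1).indices  # (B, top_k)
        picked = self.prompts[idx]                  # (B, top_k, len, D)
        return picked.flatten(1, 2)                 # prepend to input tokens
```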
MoE-Adapters4CL
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters. CVPR 2024.
[Paper] [Code]

SSIAT
Semantically-Shifted Incremental Adapter-Tuning is A Continual ViTransformer. CVPR 2024.
[Paper]

RAIL
Advancing Cross-domain Discriminability in Continual Learning of Vision-Language Models. arXiv 2024.
[Paper]

SEMA
Self-Expansion of Pre-trained Models with Mixture of Adapters for Continual Learning. arXiv 2024.
[Paper]