/privacy-preserving-prompt

Privacy-Preserving Prompt Tuning for Large Language Model

Primary LanguagePython

Privacy-Preserving Prompt Tuning for Large Language Model

Symbol 🌟 ⬜️ ⬛️
Description Inspiration White-box method Black-box method

Attacker Methodology by Stages

Prompt Injection Attacks (PIA)

Paper Year Source Attack Prompt Type Tasks
⬛️Effective Prompt Extraction from Language Models 2024.02 Static Badge Static Badge Instruction Prompt Information Extration
⬛️Prompt Stealing Attacks Against Large Language Models 2024.02 Static Badge Role-Based Prompt, Direct Prompt, In-Context Prompt Q&A
🌟⬛️TrojLLM: A Black-box Trojan Prompt Attack on Large Language Models (NeurIPS, 2023) Static Badge Static Badge Instruction prompt Classification
🌟⬛️Ignore Previous Prompt : Attack Techniques For Language Models (NeurIPS, 2022) Static Badge Static Badge Instruction prompt, In-context learning
🌟⬜️BadPrompt: Backdoor Attacks on Continuous Prompts (NeurIPS, 2022) Static Badge Static Badge Instruction prompt
⬜️PromptAttack: Prompt-based Attack for Language Models via Gradient Search (NLPCC, 2022) Static Badge Instruction prompt

Membership Inference Attacks (MIA)

Paper Year Source
🌟⬛️Do Membership Inference AttacksWork on Large Language Models? 2024.02 Static Badge Static Badge
⬜️Language Model Inversion 2023.11 Static Badge
⬛️Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks 2023.10 Static Badge
⬛️Beyond Memorization: Violating Privacy Via Inference with Large Language Models 2023.10 Static Badge Static Badge
⬜️Extracting Training Data from Large Language Models (USENIX Security, 2021) Static Badge

Protector Methodology

Differential Privacy (DP)

Paper Year Source Tasks Defense
🌟⬛️Privacy-Preserving In-Context Learning with Differentially Private Few-Shot Generation (ICLR, 2024) Static Badge Static Badge Classification, Information Extraction
🌟⬛️DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer (ICLR, 2024) Static Badge Static Badge Sentiment Classification
🌟⬛️Privacy-Preserving In-Context Learning For Large Language Models (ICLR, 2024) Static Badge Classification, Document Q&A, Dialog Summarization
⬛️On the Privacy Risk of In-context Learning (TrustNLP, 2024)) Static Badge Classification, Generation MIA
⬛️A Customized Text Sanitization Mechanism with Differential Privacy (ACL, 2023) Static Badge Static Badge Classification, Generation
🌟⬛️⬜️Flocks of Stochastic Parrots: Differentially Private Prompt Learning for Large Language Models (NeurIPS, 2023) Static Badge Classification
🌟⬛️Locally Differentially Private Document Generation Using Zero Shot Prompting (EMNLP, 2023) Static Badge Static Badge Text Classification
⬜️DP-forward: Fine-tuning and inference on language models with differential privacy in forward pass (SIGSAC, 2023) Static Badge Classification
⬛️InferDPT: Privacy-preserving Inference for Black-box Large Language Models 2023.12 Static Badge Static Badge Classification, Generation
⬜️Privacy-Preserving Prompt Tuning for Large Language Model Services 2023.05 Static Badge Sentiment Classification, Document Q&A
⬛️Differential Privacy for Text Analytics via Natural Text Sanitization (ACL-IJCNLP, 2021) Static Badge Static Badge Classification

Secure Multi-Party Computing (SMPC)

Paper Year Source Tasks Defence
⬜️Ciphergpt: Secure two-party gpt inference (Crypto, 2024) Static Badge Classification
⬜️SecFormer: Towards Fast and Accurate Privacy-Preserving Inference for Large Language Models 2024.01 Static Badge Classification, Semantic Similarity, Linguistic Acceptability Model Inside
⬜️LLMs Can Understand Encrypted Prompt: Towards Privacy-Computing Friendly Transformers 2023.05 Static Badge Classification Model Inside

Anonymization Techniques

Paper Year Source Tasks Keywords
⬛️SEmojiCrypt: Prompt Encryption for Secure Communication with Large Language Models 2024.02 Static Badge Static Badge Classification Emoji
⬛️ProPILE: Probing Privacy Leakage in Large Language Models (NeurIPS, 2023) Static Badge PII
⬛️Recovering from Privacy-Preserving Masking with Large Language Models 2023.12 Static Badge [MASK]
⬛️Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy Protection 2023.09 Static Badge Static Badge PII

Other Methods

Paper Year Source Tasks Method Keyword
⬜️Privatelora For Efficient Privacy Preserving LLM (CoRR, 2023) Static Badge Static Badge LoRA
⬜️TextObfuscator: Making Pre-trained Language Model a Privacy Protector via Obfuscating Word Representations (ACL, 2023) Static Badge Static Badge Classification

Related Survey

Paper Year Source
On Protecting the Data Privacy of Large Language Models (LLMs): A Survey 2024.03 Static Badge
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory (ICLR, 2024) Static Badge
A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly 2023.12 Static Badge
Privacy in Large Language Models: Attacks, Defenses and Future Directions 2023.10 Static Badge
Not what you’ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection 2023.05 Static Badge Static Badge
Privacy-Preserving Large Language Models (PPLLMs) 2023.01 Static Badge Static Badge

Fine-tuning

Paper Year Source
SentinelLMs: Encrypted Input Adaptation and Fine-tuning of Language Models for Private and Secure Inference (AAAI, 2024) Static Badge Static Badge