1. Adversarial Attacks and Robustness

2023

【ACL】

  • How do humans perceive adversarial text? A reality check on the validity and naturalness of word-based adversarial attacks[PDF]

  • Randomized Smoothing with Masked Inference for Adversarially Robust Text Classifications[PDF]

  • Text Adversarial Purification as Defense against Adversarial Attacks[PDF] (defense)

  • White-Box Multi-Objective Adversarial Attack on Dialogue Generation[PDF]

  • Contrastive Learning with Adversarial Examples for Alleviating Pathology of Language Model[PDF]

【AAAI】

  • SSPAttack: A Simple and Sweet Paradigm for Black-Box Hard-Label Textual Adversarial Attack (synonym substitution)

2022

【ACL】

  • Adversarial Authorship Attribution for Deobfuscation[PDF][Code]

  • Adversarial Soft Prompt Tuning for Cross-Domain Sentiment Analysis[PDF]

  • Flooding-X: Improving BERT’s Resistance to Adversarial Attacks via Loss-Restricted Fine-Tuning[PDF]

  • Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost[PDF][Code]

  • ParaDetox: Detoxification with Parallel Data[PDF][Code]

  • Pass off Fish Eyes for Pearls: Attacking Model Selection of Pre-trained Models[PDF][Code]

  • SHIELD: Defending Textual Neural Networks against Multiple Black-Box Adversarial Attacks with Stochastic Multi-Expert Patcher[PDF][Code]

  • Towards Robustness of Text-to-SQL Models Against Natural and Realistic Adversarial Table Perturbation[PDF][Code]

【EMNLP】

  • Character-level White-Box Adversarial Attacks against Transformers via Attachable Subwords Substitution[PDF][Code]
  • TASA: Deceiving Question Answering Models by Twin Answer Sentences Attack[PDF][Code]
  • Textual Manifold-based Defense Against Natural Language Adversarial Examples[PDF][Code]
  • Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP[PDF]

【COLING】

  • Semantic-Preserving Adversarial Code Comprehension[PDF]
  • PARSE: An Efficient Search Method for Black-box Adversarial Text Attacks[PDF]
  • PAEG: Phrase-level Adversarial Example Generation for Neural Machine Translation[PDF]
  • Rare but Severe Neural Machine Translation Errors Induced by Minimal Deletion: An Empirical Study on Chinese and English[PDF]

【NAACL】

  • ValCAT: Variable-Length Contextualized Adversarial Transformations Using Encoder-Decoder Language Model[PDF]
  • SHARP: Search-Based Adversarial Attack for Structured Prediction
  • Phrase-level Textual Adversarial Attack with Label Preservation
  • Adversarial Text Normalization[PDF]

【AAAI】

  • Word Level Robustness Enhancement: Fight Perturbation with Perturbation

2021

【ACL】

  • Improving Gradient-based Adversarial Training for Text Classification by Contrastive Learning and Auto-Encoder[PDF]

  • Defense against Synonym Substitution-based Adversarial Attacks via Dirichlet Neighborhood Ensemble

  • A Sweet Rabbit Hole by DARCY: Using Honeypots to Detect Universal Trigger’s Adversarial Attacks

  • Crafting Adversarial Examples for Neural Machine Translation

  • Adversarial Learning for Discourse Rhetorical Structure Parsing

  • Reliability Testing for Natural Language Processing Systems

  • Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network

  • Towards Robustness of Text-to-SQL Models against Synonym Substitution

  • Improving Paraphrase Detection with the Adversarial Paraphrasing Task

  • MATE-KD: Masked Adversarial TExt, a Companion to Knowledge Distillation

  • On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study

  • WARP: Word-level Adversarial ReProgramming

  • Improving Arabic Diacritization with Regularized Decoding and Adversarial Training

  • An Empirical Study on Adversarial Attack on NMT: Languages and Positions Matter

  • Using Adversarial Attacks to Reveal the Statistical Bias in Machine Reading Comprehension Models

  • OutFlip: Generating Examples for Unknown Intent Detection with Natural Language Attack

【EMNLP】

  • Achieving Model Robustness through Discrete Adversarial Training[PDF]
  • Multi-granularity Textual Adversarial Attack with Behavior Cloning[PDF]
  • Evaluating the Robustness of Neural Language Models to Input Perturbations[PDF]
  • Adversarial Attack against Cross-lingual Knowledge Graph Alignment
  • FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations
  • Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods
  • A Strong Baseline for Query Efficient Attacks in a Black Box Setting
  • Gradient-based Adversarial Attacks against Text Transformers[PDF]
  • Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution[PDF]
  • On the Transferability of Adversarial Attacks against Neural Text Classifier[PDF]
  • Contrasting Human- and Machine-Generated Word-Level Adversarial Examples for Text Classification[PDF]

【NAACL】

  • Universal Adversarial Attacks with Natural Triggers for Text Classification
  • Contextualized Perturbation for Textual Adversarial Attack

【AAAI】

  • Generating Natural Language Attacks in a Hard Label Black Box Setting
  • Adversarial Training with Fast Gradient Projection Method against Synonym Substitution Based Text Attacks
  • Robustness to Spurious Correlations in Text Classification via Automatically Generated Counterfactuals

1.2 Neural Machine Translation

2022

【NAACL】

  • Generating Authentic Adversarial Examples beyond Meaning-preserving with Doubly Round-trip Translation[PDF]

1.3 Sentiment Classification

2021

【NAACL】

  • Grey-box Adversarial Attack And Defence For Sentiment Classification

2. Applications of Adversarial Attacks

2022

【NAACL】

  • A Word is Worth A Thousand Dollars: Adversarial Attack on Tweets Fools Stock Prediction[PDF]

2021

【NAACL】

  • Dynamically Disentangling Social Bias from Task-Oriented Representations with Adversarial Attack
  • BBAEG: Towards BERT-based Biomedical Adversarial Example Generation for Text Classification

3. Detection and Defense against Adversarial Attacks

2022

【NAACL】

  • Don’t sweat the small stuff, classify the rest: Sample Shielding to protect text classifiers against adversarial attacks[PDF]
  • Residue-Based Natural Language Adversarial Attack Detection[PDF]
  • Self-Supervised Contrastive Learning with Adversarial Perturbations for Defending Word Substitution-based Attacks[PDF]

【AAAI】

  • Improved Text Classification via Contrastive Adversarial Training
  • KATG: Keyword-Bias-Aware Adversarial Text Generation for Text Classification

4. Interpretability and Analysis of NLP

2023

【ACL】

  • Entity Tracking in Language Models

5. Robustness Improving Fairness

2021

【ACL】

  • Does Robustness Improve Fairness? Approaching Fairness with Word Substitution Robustness Methods for Text Classification[PDF]

Chinese NLP

2021

【ACL】

  • Correcting Chinese Spelling Errors with Phonetic Pre-training
  • Dynamic Connected Networks for Chinese Spelling Check

Attacks and Defenses on Language Models

2023

【ACL】

  • Language model acceptability judgements are not always robust to context[PDF]

2021

【ACL】

  • BERT-Defense: A Probabilistic Model Based on BERT to Combat Cognitively Inspired Orthographic Adversarial Attacks[PDF]
  • Defending Pre-trained Language Models from Adversarial Word Substitutions Without Performance Sacrifice[PDF]

Optimizers

2023

【ACL】

  • CAME: Confidence-guided Adaptive Memory Efficient Optimization[PDF][Code]

Classifiers

2023

【ACL】

  • Linear Classifier: An Often-Forgotten Baseline for Text Classification