【ACL】
- How do humans perceive adversarial text? A reality check on the validity and naturalness of word-based adversarial attacks[PDF]
- Randomized Smoothing with Masked Inference for Adversarially Robust Text Classifications[PDF]
- Text Adversarial Purification as Defense against Adversarial Attacks[PDF] (defense)
- White-Box Multi-Objective Adversarial Attack on Dialogue Generation[PDF]
- Contrastive Learning with Adversarial Examples for Alleviating Pathology of Language Model[PDF]
【AAAI】
- SSPAttack: A Simple and Sweet Paradigm for Black-Box Hard-Label Textual Adversarial Attack (synonym substitution)
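Many entries in this list build on word-level synonym substitution. As a rough illustration of the general idea only (not the method of any specific paper above), here is a toy greedy substitution attack against a dummy classifier; the synonym table and the classifier are invented for the sketch.

```python
# Toy illustration of a word-level synonym-substitution attack.
# Everything here (synonym table, classifier) is invented for the sketch
# and does not reproduce any specific paper's method.

SYNONYMS = {
    "good": ["fine", "great"],
    "movie": ["film", "picture"],
    "terrible": ["awful", "dreadful"],
}

def toy_classifier(text):
    """Dummy sentiment 'model': positive iff 'good' or 'great' appears."""
    words = text.lower().split()
    return "pos" if ("good" in words or "great" in words) else "neg"

def substitution_attack(text, classifier):
    """Try single synonym swaps until the prediction flips; None if it never does."""
    original = classifier(text)
    words = text.split()
    for i, word in enumerate(words):
        for synonym in SYNONYMS.get(word.lower(), []):
            candidate = " ".join(words[:i] + [synonym] + words[i + 1:])
            if classifier(candidate) != original:
                return candidate  # adversarial example found
    return None

print(substitution_attack("good movie", toy_classifier))  # "fine movie"
```

Real attacks replace the hand-written lookup table with embedding-space or WordNet neighbors and rank substitution positions by model sensitivity; hard-label settings like SSPAttack additionally work from the predicted label alone.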
【ACL】
- Adversarial Authorship Attribution for Deobfuscation[PDF][Code]
- Adversarial Soft Prompt Tuning for Cross-Domain Sentiment Analysis[PDF]
- Flooding-X: Improving BERT’s Resistance to Adversarial Attacks via Loss-Restricted Fine-Tuning[PDF]
- Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost[PDF][Code]
- Pass off Fish Eyes for Pearls: Attacking Model Selection of Pre-trained Models[PDF][Code]
- SHIELD: Defending Textual Neural Networks against Multiple Black-Box Adversarial Attacks with Stochastic Multi-Expert Patchers[PDF][Code]
- Towards Robustness of Text-to-SQL Models Against Natural and Realistic Adversarial Table Perturbation[PDF][Code]
【EMNLP】
- Character-level White-Box Adversarial Attacks against Transformers via Attachable Subwords Substitution[PDF][Code]
- (twin answer sentences attack on QA models) TASA: Deceiving Question Answering Models by Twin Answer Sentences Attack[PDF][Code]
- Textual Manifold-based Defense Against Natural Language Adversarial Examples[PDF][Code]
- Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP[PDF]
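Alongside word-level attacks, the EMNLP entries above include character-level perturbations. A minimal, hypothetical sketch of the kind of typo-style edit such attacks apply (an adjacent-character swap); this is not the attachable-subword method of the paper listed above.

```python
# Minimal character-level perturbation: swap two adjacent inner characters.
# A toy sketch of typo-style edits used by character-level attacks; it does
# not reproduce the subword-substitution method from the paper above.

def swap_chars(word, i):
    """Return `word` with characters i and i+1 swapped (no-op at the end)."""
    if i + 1 >= len(word):
        return word
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def perturb_sentence(sentence):
    """Swap the 2nd and 3rd characters of every word longer than 3 letters."""
    return " ".join(
        swap_chars(w, 1) if len(w) > 3 else w for w in sentence.split()
    )

print(perturb_sentence("this is a test"))  # "tihs is a tset"
```

Such edits often leave the text readable to humans while landing out-of-vocabulary tokens on the model, which is why several defenses in this list target tokenization robustness.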
【COLING】
- Semantic-Preserving Adversarial Code Comprehension[PDF]
- PARSE: An Efficient Search Method for Black-box Adversarial Text Attacks[PDF]
- PAEG: Phrase-level Adversarial Example Generation for Neural Machine Translation[PDF]
- (NMT errors induced by minimal deletion) Rare but Severe Neural Machine Translation Errors Induced by Minimal Deletion: An Empirical Study on Chinese and English[PDF]
【NAACL】
- ValCAT: Variable-Length Contextualized Adversarial Transformations Using Encoder-Decoder Language Model[PDF]
- SHARP: Search-Based Adversarial Attack for Structured Prediction
- Phrase-level Textual Adversarial Attack with Label Preservation
- Adversarial Text Normalization[PDF]
【AAAI】
- Word Level Robustness Enhancement: Fight Perturbation with Perturbation
【ACL】
- Improving Gradient-based Adversarial Training for Text Classification by Contrastive Learning and Auto-Encoder[PDF]
- Defense against Synonym Substitution-based Adversarial Attacks via Dirichlet Neighborhood Ensemble
- A Sweet Rabbit Hole by DARCY: Using Honeypots to Detect Universal Trigger’s Adversarial Attacks
- Crafting Adversarial Examples for Neural Machine Translation
- Adversarial Learning for Discourse Rhetorical Structure Parsing
- Reliability Testing for Natural Language Processing Systems
- Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network
- Towards Robustness of Text-to-SQL Models against Synonym Substitution
- Improving Paraphrase Detection with the Adversarial Paraphrasing Task
- MATE-KD: Masked Adversarial TExt, a Companion to Knowledge Distillation
- On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study
- WARP: Word-level Adversarial ReProgramming
- Improving Arabic Diacritization with Regularized Decoding and Adversarial Training
- An Empirical Study on Adversarial Attack on NMT: Languages and Positions Matter
- Using Adversarial Attacks to Reveal the Statistical Bias in Machine Reading Comprehension Models
- OutFlip: Generating Examples for Unknown Intent Detection with Natural Language Attack
【EMNLP】
- Achieving Model Robustness through Discrete Adversarial Training[PDF]
- Multi-granularity Textual Adversarial Attack with Behavior Cloning[PDF]
- (robustness of neural LMs to input perturbations) Evaluating the Robustness of Neural Language Models to Input Perturbations[PDF]
- (attacking cross-lingual KG alignment) Adversarial Attack against Cross-lingual Knowledge Graph Alignment
- (feature-based adversarial meta-embeddings for robust inputs) FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations
- (attacking KG embeddings via instance attribution) Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods
- (query-efficient black-box attacks) A Strong Baseline for Query Efficient Attacks in a Black Box Setting
- Gradient-based Adversarial Attacks against Text Transformers[PDF]
- Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution[PDF]
- On the Transferability of Adversarial Attacks against Neural Text Classifier[PDF]
- Contrasting Human- and Machine-Generated Word-Level Adversarial Examples for Text Classification[PDF]
【NAACL】
- Universal Adversarial Attacks with Natural Triggers for Text Classification
- Contextualized Perturbation for Textual Adversarial Attack
【AAAI】
- Generating Natural Language Attacks in a Hard Label Black Box Setting
- (adversarial training with fast gradient projection against synonym-substitution attacks) Adversarial Training with Fast Gradient Projection Method against Synonym Substitution Based Text Attacks
- (robustness to spurious correlations via generated counterfactuals) Robustness to Spurious Correlations in Text Classification via Automatically Generated Counterfactuals
【NAACL】
- Generating Authentic Adversarial Examples beyond Meaning-preserving with Doubly Round-trip Translation[PDF]
- Grey-box Adversarial Attack And Defence For Sentiment Classification
- (adversarial tweets fool stock prediction) A Word is Worth A Thousand Dollars: Adversarial Attack on Tweets Fools Stock Prediction[PDF]
- Dynamically Disentangling Social Bias from Task-Oriented Representations with Adversarial Attack
- BBAEG: Towards BERT-based Biomedical Adversarial Example Generation for Text Classification
- (sample shielding as defense) Don’t sweat the small stuff, classify the rest: Sample Shielding to protect text classifiers against adversarial attacks[PDF]
- Residue-Based Natural Language Adversarial Attack Detection[PDF]
- Self-Supervised Contrastive Learning with Adversarial Perturbations for Defending Word Substitution-based Attacks[PDF]
【AAAI】
- Improved Text Classification via Contrastive Adversarial Training
- KATG: Keyword-Bias-Aware Adversarial Text Generation for Text Classification
【ACL】
- Entity Tracking in Language Models
- Does Robustness Improve Fairness? Approaching Fairness with Word Substitution Robustness Methods for Text Classification[PDF]
- Correcting Chinese Spelling Errors with Phonetic Pre-training
- Dynamic Connected Networks for Chinese Spelling Check
- Language model acceptability judgements are not always robust to context[PDF]
- BERT-Defense: A Probabilistic Model Based on BERT to Combat Cognitively Inspired Orthographic Adversarial Attacks[PDF]
- Defending Pre-trained Language Models from Adversarial Word Substitutions Without Performance Sacrifice[PDF]
- Linear Classifier: An Often-Forgotten Baseline for Text Classification