adversarial robustness

1335 papers

Also known as

UAP, PAR, ADV, AR

Co-occurring keywords

adversarial training (1261), adversarial attack (1599), neural network (6616), adversarial example (563), adversarial learning (1592), model robustness (478), adversarial defense (324), large language model (12755), certified robustness (116), adversarial perturbation (376)

Papers

Threading the Needle of On and Off-Manifold Value Functions for Shapley Explanations AISTATS 2022

Enhancing Tabular Reasoning with Pattern Exploiting Training IJCNLP 2022

Robust Hate Speech Detection via Mitigating Spurious Correlations IJCNLP 2022

Where to Attack: A Dynamic Locator Model for Backdoor Attack in Text Classifications COLING 2022

A Word is Worth A Thousand Dollars: Adversarial Attack on Tweets Fools Stock Prediction NAACL 2022

Robust Lottery Tickets for Pre-trained Language Models ACL 2022

On the Robustness of Offensive Language Classifiers ACL 2022

Towards Adversarially Robust Text Classifiers by Learning to Reweight Clean Examples ACL 2022

Investigating Selective Prediction Approaches Across Several Tasks in IID, OOD, and Adversarial Settings ACL 2022

Cross-Domain Detection of GPT-2-Generated Technical Text NAACL 2022

ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation EMNLP 2022

Can Rationalization Improve Robustness? NAACL 2022

Provable Adversarial Robustness for Fractional Lp Threat Models AISTATS 2022

Improving the Adversarial Robustness of NLP Models by Information Bottleneck ACL 2022

Measure and Improve Robustness in NLP Models: A Survey NAACL 2022

Towards Robustness of Text-to-SQL Models Against Natural and Realistic Adversarial Table Perturbation ACL 2022

Generalized but not Robust? Comparing the Effects of Data Modification Methods on Out-of-Domain Generalization and Adversarial Robustness ACL 2022

A Study of the Attention Abnormality in Trojaned BERTs NAACL 2022

Detecting Word-Level Adversarial Text Attacks via SHapley Additive exPlanations ACL 2022

Weight Perturbation as Defense against Adversarial Word Substitutions EMNLP 2022

Don’t sweat the small stuff, classify the rest: Sample Shielding to protect text classifiers against adversarial attacks NAACL 2022

Enhancing Tabular Reasoning with Pattern Exploiting Training AACL 2022

Perturbation type categorization for multiple adversarial perturbation robustness UAI 2022

Robust Image Forgery Detection Over Online Social Network Shared Images CVPR 2022

Efficient Training of Low-Curvature Neural Networks NeurIPS 2022