conftrace
Adversarial Learning
4,854 papers
Papers per year
2006: 3
2007: 1
2009: 4
2010: 6
2011: 3
2012: 5
2013: 10
2014: 6
2015: 8
2016: 18
2017: 87
2018: 261
2019: 551
2020: 588
2021: 703
2022: 633
2023: 672
2024: 579
2025: 561
2026: 155
Papers
[MASK] Insertion: a robust method for anti-adversarial attacks
EACL 2023
Prompting for explanations improves Adversarial NLI. Is this true? {Yes} it is {true} because {it weakens superficial cues}
EACL 2023
Unveiling the Implicit Toxicity in Large Language Models
EMNLP 2023
Lion: Adversarial Distillation of Proprietary Large Language Models
EMNLP 2023
TrojanSQL: SQL Injection against Natural Language Interface to Database
EMNLP 2023
CT-GAT: Cross-Task Generative Adversarial Attack based on Transferability
EMNLP 2023
Hidding the Ghostwriters: An Adversarial Evaluation of AI-Generated Student Essay Detection
EMNLP 2023
Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models
EMNLP 2023
MeaeQ: Mount Model Extraction Attacks with Efficient Queries
EMNLP 2023
“Are Your Explanations Reliable?” Investigating the Stability of LIME in Explaining Text Classifiers by Marrying XAI and Adversarial Attack
EMNLP 2023
Generative Adversarial Training with Perturbed Token Detection for Model Robustness
EMNLP 2023
Poisoning Retrieval Corpora by Injecting Adversarial Passages
EMNLP 2023
RobustQA: A Framework for Adversarial Text Generation Analysis on Question Answering Systems
EMNLP 2023
Improving Classifier Robustness through Active Generative Counterfactual Data Augmentation
EMNLP 2023
How Reliable Are AI-Generated-Text Detectors? An Assessment Framework Using Evasive Soft Prompts
EMNLP 2023
Attack Prompt Generation for Red Teaming and Defending Large Language Models
EMNLP 2023
No offence, Bert - I insult only humans! Multilingual sentence-level attack on toxicity detection networks
EMNLP 2023
RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training
EMNLP 2023
Multi-step Jailbreaking Privacy Attacks on ChatGPT
EMNLP 2023
Robustness Tests for Automatic Machine Translation Metrics with Adversarial Attacks
EMNLP 2023
ASSERT: Automated Safety Scenario Red Teaming for Evaluating the Robustness of Large Language Models
EMNLP 2023
Effects of Human Adversarial and Affable Samples on BERT Generalization
EMNLP 2023
Guiding LLM to Fool Itself: Automatically Manipulating Machine Reading Comprehension Shortcut Triggers
EMNLP 2023
Are Personalized Stochastic Parrots More Dangerous? Evaluating Persona Biases in Dialogue Systems
EMNLP 2023
A Black-Box Attack on Code Models via Representation Nearest Neighbor Search
EMNLP 2023