Research Explorer
AI Safety
2972 directly classified papers
Papers per year
2002: 1
2006: 1
2007: 1
2012: 4
2013: 1
2015: 5
2016: 1
2017: 13
2018: 40
2019: 91
2020: 111
2021: 181
2022: 204
2023: 333
2024: 642
2025: 1031
2026: 312
Papers
HAMLET4Fairness: Enhancing Fairness in AI Pipelines Through Human-Centered AutoML and Argumentation
AAAI 2026
Poisoning with a Pill: Circumventing Detection in Federated Learning
AAAI 2026
LUCID: Learning-Enabled Uncertainty-Aware Certification of Stochastic Dynamical Systems
AAAI 2026
Delphi: A Neuro-Symbolic Framework for Individualized, Safe and Interpretable Treatment Recommendation
AAAI 2026
Persistent Backdoor Attacks Under Continual Fine-Tuning of LLMs
AAAI 2026
Good Gradients Poison Your Model: Evading Defenses in Federated Learning via Boundary-adaptive Perturbation
AAAI 2026
Universal Adversarial Purification with DDIM Metric Loss for Stable Diffusion
AAAI 2026
Dual-View Inference Attack: Machine Unlearning Amplifies Privacy Exposure
AAAI 2026
MTAttack: Multi-Target Backdoor Attacks Against Large Vision-Language Models
AAAI 2026
FRBAT: Conditionally-Visible Physical Backdoor Attack via Fluorescence
AAAI 2026
Advancing Out-of-Distribution Detection Across Diverse Scenarios
AAAI 2026
Creating Blank Canvas Against AI-enabled Image Forgery
AAAI 2026
Certified but Fooled! Breaking Certified Defenses with Ghost Certificates
AAAI 2026
On Trustworthy, Explainable, and Verifiable High-Level Autonomy via Hierarchical Planning
AAAI 2026
Persistent Instability in LLM’s Personality Measurements: Effects of Scale, Reasoning, and Conversation History
AAAI 2026
Characterizing AI Manipulation Risks in Brazilian YouTube Climate Discourse
AAAI 2026
Evaluating LLMs for Police Decision-Making: A Framework Based on Police Action Scenarios
AAAI 2026
Can Editing LLMs Inject Harm?
AAAI 2026
The Emotional Baby Is Truly Deadly: Does Your Multimodal Large Reasoning Model Have Emotional Flattery Towards Humans?
AAAI 2026
PhysPatch: A Physically Realizable and Transferable Adversarial Patch Attack for Multimodal Large Language Models-based Autonomous Driving Systems
AAAI 2026
Diversifying Counterattacks: Orthogonal Exploration for Robust CLIP Inference
AAAI 2026
Selective Weak-to-Strong Generalization
AAAI 2026
VILTA: A VLM-in-the-Loop Adversary for Enhancing Driving Policy Robustness
AAAI 2026
Clean-Label Physical Backdoor Attacks with Data Distillation
AAAI 2026
EigenShield: Inference-Time, Model-Agnostic Jailbreaking Defense via Causal Subspace Filtering
AAAI 2026