Research Explorer
AI Safety
2972 directly classified papers
Papers per year
2002: 1
2006: 1
2007: 1
2012: 4
2013: 1
2015: 5
2016: 1
2017: 13
2018: 40
2019: 91
2020: 111
2021: 181
2022: 204
2023: 333
2024: 642
2025: 1031
2026: 312
Papers
Breaking Barriers in Physical-World Adversarial Examples: Improving Robustness and Transferability via Robust Feature
AAAI 2025
Adversarial-Inspired Backdoor Defense via Bridging Backdoor and Adversarial Attacks
AAAI 2025
Everywhere Attack: Attacking Locally and Globally to Boost Targeted Transferability
AAAI 2025
Backdoor Attack on Propagation-based Rumor Detectors
AAAI 2025
Crossfire: An Elastic Defense Framework for Graph Neural Networks Under Bit Flip Attacks
AAAI 2025
Graph Agent Network: Empowering Nodes with Inference Capabilities for Adversarial Resilience
AAAI 2025
Grimm: A Plug-and-Play Perturbation Rectifier for Graph Neural Networks Defending Against Poisoning Attacks
AAAI 2025
Protecting Model Adaptation from Trojans in the Unlabeled Data
AAAI 2025
Quantitative Predictive Monitoring and Control for Safe Human-Machine Interaction
AAAI 2025
Quantifying Misalignment Between Agents: Towards a Sociotechnical Understanding of Alignment
AAAI 2025
Debate Helps Weak-to-Strong Generalization
AAAI 2025
Is Your Autonomous Vehicle Safe? Understanding the Threat of Electromagnetic Signal Injection Attacks on Traffic Scene Perception
AAAI 2025
Single Character Perturbations Break LLM Alignment
AAAI 2025
Towards Computational Foreseeability
AAAI 2025
From Gambits to Assurances: Game-Theoretic Integration of Safety and Learning for Interactive Robotics
AAAI 2025
Identifying Predictions That Influence the Future: Detecting Performative Concept Drift in Data Streams
AAAI 2025
On the Robustness of Distributed Machine Learning Against Transfer Attacks
AAAI 2025
Fusing Pruned and Backdoored Models: Optimal Transport-based Data-free Backdoor Mitigation
AAAI 2025
Operationalising Rawlsian Ethics for Fairness in Norm Learning Agents
AAAI 2025
Increased Compute Efficiency and the Diffusion of AI Capabilities
AAAI 2025
Measuring Error Alignment for Decision-Making Systems
AAAI 2025
All You Need Is S P A C E: When Jailbreaking Meets Bias Audit and Reveals What Lies Beneath the Guardrails (Student Abstract)
AAAI 2025
Guaranteeing Out-Of-Distribution Detection in Deep RL via Transition Estimation
AAAI 2025
Safe Planner: Empowering Safety Awareness in Large Pre-Trained Models for Robot Task Planning
AAAI 2025
ASP-Driven Emergency Planning for Norm Violations in Reinforcement Learning
AAAI 2025