Research Explorer
AI Safety
2972 directly classified papers
Papers per year
2002: 1
2006: 1
2007: 1
2012: 4
2013: 1
2015: 5
2016: 1
2017: 13
2018: 40
2019: 91
2020: 111
2021: 181
2022: 204
2023: 333
2024: 642
2025: 1031
2026: 312
Papers
Scalable Edge Blocking Algorithms for Defending Active Directory Style Attack Graphs
AAAI 2023
The Perils of Trial-and-Error Reward Design: Misdesign through Overfitting and Invalid Task Specifications
AAAI 2023
Augmented Proximal Policy Optimization for Safe Reinforcement Learning
AAAI 2023
Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints
AAAI 2023
Shielding in Resource-Constrained Goal POMDPs
AAAI 2023
Probabilities Are Not Enough: Formal Controller Synthesis for Stochastic Dynamical Models with Epistemic Uncertainty
AAAI 2023
Safe Reinforcement Learning via Shielding under Partial Observability
AAAI 2023
Correct-by-Construction Reinforcement Learning of Cardiac Pacemakers from Duration Calculus Requirements
AAAI 2023
Certified Policy Smoothing for Cooperative Multi-Agent Reinforcement Learning
AAAI 2023
Evaluating Model-Free Reinforcement Learning toward Safety-Critical Tasks
AAAI 2023
Safety Validation of Learning-Based Autonomous Systems: A Multi-Fidelity Approach
AAAI 2023
Safe Interactive Autonomy for Multi-Agent Systems
AAAI 2023
Towards Safe Reinforcement Learning via OOD Dynamics Detection in Autonomous Driving System (Student Abstract)
AAAI 2023
Risk-Aware Decentralized Safe Control via Dynamic Responsibility Allocation (Student Abstract)
AAAI 2023
Tackling Safe and Efficient Multi-Agent Reinforcement Learning via Dynamic Shielding (Student Abstract)
AAAI 2023
Probabilistic Reasoning and Learning for Trustworthy AI
AAAI 2023
AAAI New Faculty Highlights: General and Scalable Optimization for Robust AI
AAAI 2023
Failure-Resistant Intelligent Interaction for Reliable Human-AI Collaboration
AAAI 2023
Black-Box Adversarial Attack on Time Series Classification
AAAI 2023
On the Vulnerability of Backdoor Defenses for Federated Learning
AAAI 2023
Redactor: A Data-Centric and Individualized Defense against Inference Attacks
AAAI 2023
Local Justice and Machine Learning: Modeling and Inferring Dynamic Ethical Preferences toward Allocations
AAAI 2023
Mitigating Adversarial Norm Training with Moral Axioms
AAAI 2023
Responsible Robotics: A Socio-Ethical Addition to Robotics Courses
AAAI 2023
AI Audit: A Card Game to Reflect on Everyday AI Systems
AAAI 2023