Research Explorer
AI Safety
2972 directly classified papers
Papers per year
2002: 1
2006: 1
2007: 1
2012: 4
2013: 1
2015: 5
2016: 1
2017: 13
2018: 40
2019: 91
2020: 111
2021: 181
2022: 204
2023: 333
2024: 642
2025: 1031
2026: 312
Papers
CerDEQ: Certifiable Deep Equilibrium Model
ICML 2022
Double Sampling Randomized Smoothing
ICML 2022
Diffusion Models for Adversarial Purification
ICML 2022
A Causal Analysis of Harm
NIPS 2022
ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation
EMNLP 2022
Adversarial training for high-stakes reliability
NIPS 2022
Moderate-fitting as a Natural Backdoor Defender for Pre-trained Language Models
NIPS 2022
Safe Robot Learning in Assistive Devices through Neural Network Repair
CORL 2022
Safe Control Under Input Limits with Neural Control Barrier Functions
CORL 2022
A Model Selection Approach for Corruption Robust Reinforcement Learning
ALT 2022
Measuring Data Reconstruction Defenses in Collaborative Inference Systems
NIPS 2022
Regret Bounds for Risk-Sensitive Reinforcement Learning
NIPS 2022
Enhancing Safe Exploration Using Safety State Augmentation
NIPS 2022
Parametrically Retargetable Decision-Makers Tend To Seek Power
NIPS 2022
Provable Defense against Backdoor Policies in Reinforcement Learning
NIPS 2022
Shield Decentralization for Safe Multi-Agent Reinforcement Learning
NIPS 2022
Threat Scenarios and Best Practices to Detect Neural Fake News
COLING 2022
Are AlphaZero-like Agents Robust to Adversarial Perturbations?
NIPS 2022
Defining and Characterizing Reward Gaming
NIPS 2022
Constrained Update Projection Approach to Safe Policy Optimization
NIPS 2022
Double Bubble, Toil and Trouble: Enhancing Certified Robustness through Transitivity
NIPS 2022
On the Limitations of Stochastic Pre-processing Defenses
NIPS 2022
Information-Theoretic Safe Exploration with Gaussian Processes
NIPS 2022
When are Local Queries Useful for Robust Learning?
NIPS 2022
Improving Certified Robustness via Statistical Learning with Logical Reasoning
NIPS 2022