Research Explorer
AI Safety
2972 directly classified papers
Papers per year
2002: 1
2006: 1
2007: 1
2012: 4
2013: 1
2015: 5
2016: 1
2017: 13
2018: 40
2019: 91
2020: 111
2021: 181
2022: 204
2023: 333
2024: 642
2025: 1031
2026: 312
Papers
Investigations of Performance and Bias in Human-AI Teamwork in Hiring
AAAI 2022
Exploring the Vulnerability of Deep Reinforcement Learning-based Emergency Control for Low Carbon Power Systems
IJCAI 2022
NaturalAdversaries: Can Naturalistic Adversaries Be as Effective as Artificial Adversaries?
EMNLP 2022
Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models
EMNLP 2022
Data poisoning attacks on off-policy policy evaluation methods
UAI 2022
Defending against Model Stealing via Verifying Embedded External Features
AAAI 2022
Planning to Avoid Side Effects
AAAI 2022
Dim-Krum: Backdoor-Resistant Federated Learning for NLP with Dimension-wise Krum-Based Aggregation
EMNLP 2022
Task-Relevant Failure Detection for Trajectory Predictors in Autonomous Vehicles
CORL 2022
Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP
EMNLP 2022
Reducing Offensive Replies in Open Domain Dialogue Systems
INTERSPEECH 2022
Tight Neural Network Verification via Semidefinite Relaxations and Linear Reformulations
AAAI 2022
Hibernated Backdoor: A Mutual Information Empowered Backdoor Attack to Deep Neural Networks
AAAI 2022
Textual Backdoor Attacks Can Be More Harmful via Two Simple Tricks
EMNLP 2022
Delivering Trustworthy AI through Formal XAI
AAAI 2022
Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness
ICML 2022
Certified Adversarial Robustness Under the Bounded Support Set
ICML 2022
Resiliency of Perception-Based Controllers Against Attacks
L4DC 2022
Barrier Bayesian Linear Regression: Online Learning of Control Barrier Conditions for Safety-Critical Control of Uncertain Systems
L4DC 2022
Adversarially Robust Stability Certificates can be Sample-Efficient
L4DC 2022
Data-Driven Safety Verification of Stochastic Systems via Barrier Certificates: A Wait-and-Judge Approach
L4DC 2022
Out of Distribution Detection via Neural Network Anchoring
ACML 2022
The Battlefront of Combating Misinformation and Coping with Media Bias
AACL 2022
Robust Hate Speech Detection via Mitigating Spurious Correlations
AACL 2022
UKP-SQuARE v2: Explainability and Adversarial Attacks for Trustworthy QA
AACL 2022