conftrace
Artificial Intelligence › Core AI › Safety
414 papers
Papers per year
2016: 1
2017: 1
2018: 4
2019: 8
2020: 11
2021: 21
2022: 29
2023: 36
2024: 87
2025: 117
2026: 99
Papers
DeformRS: Certifying Input Deformations with Randomized Smoothing
AAAI 2022
Safe Online Convex Optimization with Unknown Linear Safety Constraints
AAAI 2022
Distillation of RL Policies with Formal Guarantees via Variational Abstraction of Markov Decision Processes
AAAI 2022
Tight Neural Network Verification via Semidefinite Relaxations and Linear Reformulations
AAAI 2022
Stability Verification in Stochastic Control Systems via Neural Network Supermartingales
AAAI 2022
Exploring Safer Behaviors for Deep Reinforcement Learning
AAAI 2022
Preemptive Image Robustification for Protecting Users against Man-in-the-Middle Adversarial Attacks
AAAI 2022
Planning to Avoid Side Effects
AAAI 2022
Leashing the Inner Demons: Self-Detoxification for Language Models
AAAI 2022
‘Beach’ to ‘Bitch’: Inadvertent Unsafe Transcription of Kids’ Content on YouTube
AAAI 2022
SaFeRDialogues: Taking Feedback Gracefully after Conversational Safety Failures
ACL 2022
On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark
ACL 2022
Best Arm Identification with Safety Constraints
AISTATS 2022
Two Coupled Rejection Metrics Can Tell Adversarial Examples Apart
CVPR 2022
SafeText: A Benchmark for Exploring Physical Safety in Language Models
EMNLP 2022
Red Teaming Language Models with Language Models
EMNLP 2022
ProsocialDialog: A Prosocial Backbone for Conversational Agents
EMNLP 2022
Handling and Presenting Harmful Text in NLP Research
EMNLP 2022
Safe Reinforcement Learning by Imagining the Near Future
NIPS 2021
Anti-Backdoor Learning: Training Clean Models on Poisoned Data
NIPS 2021
Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs
NIPS 2021
Topological Detection of Trojaned Neural Networks
NIPS 2021
Safe Policy Optimization with Local Generalized Linear Function Approximations
NIPS 2021
Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds
NIPS 2021
Counterexample Guided RL Policy Refinement Using Bayesian Optimization
NIPS 2021