Steven Basart
10 papers · 2021–2024 · 5 top CS/AI conferences
Achievements
🐝 Cross-Pollinator (9)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🌍 Conference Polyglot (5)
🌈 Renaissance Researcher (6)
🤝 Dynamic Duo (10)
👥 Mega-Team (46)
❓ The Questioner (3)
💎 Century Club (10)
Conferences
ICML (4)
ICLR (2)
NIPS (2)
CVPR (1)
ICCV (1)
Keywords
out-of-distribution detection (2)
data augmentation (2)
image classification (2)
model robustness (1)
video understanding (1)
out-of-distribution generalization (1)
ai safety (1)
affective computing (1)
distribution shift (1)
spectral analysis (1)
deep neural network (1)
spurious correlation (1)
model scaling (1)
adversarial example (1)
safety benchmark (1)
human preference (1)
capabilities component (1)
commonsense reasoning (1)
affective state (1)
reward optimization (1)
Papers
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
NIPS 2024
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
ICML 2024
The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning
ICML 2024
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark
ICML 2023
How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios
NIPS 2022
Scaling Out-of-Distribution Detection for Real-World Settings
ICML 2022
Natural Adversarial Examples
CVPR 2021
Measuring Massive Multitask Language Understanding
ICLR 2021
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
ICCV 2021
Aligning AI With Shared Human Values
ICLR 2021