conftrace_

Steven Basart

10 papers · 2021–2024 · 5 conferences · across top CS/AI conferences

Achievements

🐝 Cross-Pollinator (9) · 🌉 Interdisciplinary Bridge · 🧭 Keyword Pioneer · 🌍 Conference Polyglot (5) · 🌈 Renaissance Researcher (6) · 🤝 Dynamic Duo (10) · 👥 Mega-Team (46) · ❓ The Questioner (3) · 💎 Century Club (10)

Conferences

ICML (4) · ICLR (2) · NIPS (2) · CVPR (1) · ICCV (1)

Top co-authors

Dan Hendrycks (10) · Jacob Steinhardt (6) · Dawn Song (6) · Andy Zou (6) · Mantas Mazeika (6) · Alexander Pan (3) · Nathaniel Li (3) · Zifan Wang (2) · Long Phan (2) · Alice Gatti (2)

Keywords

out-of-distribution detection (2) · data augmentation (2) · image classification (2) · model robustness (1) · video understanding (1) · out-of-distribution generalization (1) · ai safety (1) · affective computing (1) · distribution shift (1) · spectral analysis (1) · deep neural network (1) · spurious correlation (1) · model scaling (1) · adversarial example (1) · safety benchmark (1) · human preference (1) · capabilities component (1) · commonsense reasoning (1) · affective state (1) · reward optimization (1)

Papers

Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? · NIPS 2024
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal · ICML 2024
The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning · ICML 2024
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark · ICML 2023
How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios · NIPS 2022
Scaling Out-of-Distribution Detection for Real-World Settings · ICML 2022
Natural Adversarial Examples · CVPR 2021
Measuring Massive Multitask Language Understanding · ICLR 2021
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization · ICCV 2021
Aligning AI With Shared Human Values · ICLR 2021