Nathaniel Li
3 papers · 2023–2024 · 1 conference · across top CS/AI conferences
Achievements
🌉 Interdisciplinary Bridge
🐝 Cross-Pollinator (10)
👥 Mega-Team (46)
❓ The Questioner
Conferences
ICML (3)
Papers
The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning
ICML 2024
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
ICML 2024
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark
ICML 2023