W. Bradley Knox
8 papers · 2021–2026 · 3 conferences · across top CS/AI conferences
Achievements
Conference Polyglot (3)
Academic Marathon (5)
Interdisciplinary Bridge
Taxonomy Completionist (15)
Keyword Pioneer
Cross-Pollinator (15)
Conferences
AAAI (5)
ICLR (2)
NIPS (1)
Research topics
Keywords
reinforcement learning (4)
reward design (2)
reward function (2)
direct preference optimization (1)
autonomous driving (1)
policy learning (1)
reward learning (1)
robot manipulation (1)
cost function (1)
reward hacking (1)
language model (1)
reward model (1)
autonomous agent (1)
autonomous vehicle (1)
sparse reward (1)
safety evaluation (1)
risk assessment (1)
prompt injection (1)
reward inference (1)
llm agent (1)
Papers
MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control
AAAI 2026
Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions
ICLR 2025
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
NIPS 2024
Contrastive Preference Learning: Learning from Human Feedback without Reinforcement Learning
ICLR 2024
Reward (Mis)design for Autonomous Driving (Abstract Reprint)
AAAI 2024
Learning Optimal Advantage from Preferences and Mistaking It for Reward
AAAI 2024
The Perils of Trial-and-Error Reward Design: Misdesign through Overfitting and Invalid Task Specifications
AAAI 2023
Demonstration of the EMPATHIC Framework for Task Learning from Implicit Human Feedback
AAAI 2021