Anca Dragan
64 papers · 2012–2025 · 9 top CS/AI conferences
Achievements
Keyword Pioneer
Hot Topic Early Bird
Conference Polyglot (9)
Interdisciplinary Bridge
Academic Marathon (13)
Taxonomy Completionist (15)
Keyword Trendsetter Combo (5)
Dynamic Duo (14)
Triple Crown
Keyword Champion (4)
Grand Slam
Deep Specialist (13)
Keyword Collector (55)
Trend Setter
Unstoppable (10)
Conference Pioneer
Prolific Year (12)
Century Club (64)
The Questioner (2)
Conferences
ICML (15)
NIPS (14)
ICLR (13)
RSS (10)
CORL (6)
ACL (2)
IJCAI (2)
AAAI (1)
EMNLP (1)
Research topics
Keywords
inverse reinforcement learning (12)
reward function (10)
human-robot interaction (8)
reinforcement learning (6)
reward learning (5)
preference learning (5)
motion planning (3)
trajectory optimization (3)
reward inference (3)
multi-agent system (3)
shared autonomy (3)
value alignment (3)
imitation learning (2)
offline reinforcement learning (2)
reward modeling (2)
deep reinforcement learning (2)
robot planning (2)
behavior cloning (2)
bayesian inference (2)
game theory (2)
Papers
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
ICLR 2025
AssistanceZero: Scalably Solving Assistance Games
ICML 2025
Adversaries Can Misuse Combinations of Safe Models
ICML 2025
On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback
ICLR 2025
Context Steering: Controllable Personalization at Inference Time
ICLR 2025
Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking
ICLR 2025
Offline RL with Observation Histories: Analyzing and Improving Sample Complexity
ICLR 2024
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
ICML 2024
Learning to Model the World With Language
ICML 2024
AI Alignment with Changing and Influenceable Reward Functions
ICML 2024
Learning to Assist Humans without Inferring Rewards
NIPS 2024
When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback
NIPS 2024
Trajectory Improvement and Reward Learning from Comparative Language Feedback
CORL 2024
Learning Optimal Advantage from Preferences and Mistaking It for Reward
AAAI 2024
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2
EMNLP 2024
Coprocessor Actor Critic: A Model-Based Reinforcement Learning Approach For Adaptive Brain Stimulation
ICML 2024
The Effective Horizon Explains Deep RL Performance in Stochastic Environments
ICLR 2024
Confronting Reward Model Overoptimization with Constrained RLHF
ICLR 2024
Quantifying Assistive Robustness Via the Natural-Adversarial Frontier
CORL 2023
On the Sensitivity of Reward Inference to Misspecified Human Models
ICLR 2023
Causal Confusion and Reward Misidentification in Preference-Based Reward Learning
ICLR 2023
Learning to Influence Human Behavior with Offline Reinforcement Learning
NIPS 2023
Bridging RL Theory and Practice with the Effective Horizon
NIPS 2023
Automatically Auditing Large Language Models via Discrete Optimization
ICML 2023
Contextual Reliability: When Different Features Matter in Different Contexts
ICML 2023
Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control
CORL 2023
The Boltzmann Policy Distribution: Accounting for Systematic Suboptimality in Human Models
ICLR 2022
Estimating and Penalizing Induced Preference Shifts in Recommender Systems
ICML 2022
Inferring Rewards from Language in Context
ACL 2022
Uni[MASK]: Unified Inference in Sequential Decision Problems
NIPS 2022
First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization
NIPS 2022
Learning Representations that Enable Generalization in Assistive Tasks
CORL 2022
On complementing end-to-end human behavior predictors with planning
RSS 2021
Pragmatic Image Compression for Human-in-the-Loop Decision-Making
NIPS 2021
Learning What To Do by Simulating the Past
ICLR 2021
X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback
ICLR 2021
Value Alignment Verification
ICML 2021
Policy Gradient Bayesian Robust Optimization for Imitation Learning
ICML 2021
AvE: Assistance via Empowerment
NIPS 2020
Reward-rational (implicit) choice: A unifying formalism for reward learning
NIPS 2020
Learning Human Objectives by Evaluating Hypothetical Behavior
ICML 2020
Assisted Perception: Optimizing Observations to Communicate State
CORL 2020
Preference learning along multiple criteria: A game-theoretic perspective
NIPS 2020
Preferences Implicit in the State of the World
ICLR 2019
On the Utility of Learning about Humans for Human-AI Coordination
NIPS 2019
Learning a Prior over Intent via Meta-Inverse Reinforcement Learning
ICML 2019
On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference
ICML 2019
An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning
ICML 2018
Probabilistically Safe Robot Planning with Confidence-Based Human Predictions
RSS 2018
Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior
NIPS 2018
Shared Autonomy via Deep Reinforcement Learning
RSS 2018
Simplifying Reward Design through Divide-and-Conquer
RSS 2018
Inverse Reward Design
NIPS 2017
Should Robots be Obedient?
IJCAI 2017
Active Preference-Based Learning of Reward Functions
RSS 2017
Enabling Robots to Communicate Their Objectives
RSS 2017
Translating Neuralese
ACL 2017
DART: Noise Injection for Robust Imitation Learning
CORL 2017
The Off-Switch Game
IJCAI 2017
Functional Gradient Motion Planning in Reproducing Kernel Hilbert Spaces
RSS 2016
Cooperative Inverse Reinforcement Learning
NIPS 2016
An Analysis of Deceptive Robot Motion
RSS 2014
Generating Legible Motion
RSS 2013
Formalizing Assistive Teleoperation
RSS 2012