conftrace_

Stuart Russell

52 papers · 2008–2025 · 10 conferences · across top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (23) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (9) 🌍 Conference Polyglot (10) 🏃 Academic Marathon (17) 🌟 Keyword Trendsetter Combo (7) 🏆 Keyword Champion (2) 👑 Triple Crown 🧬 Topic Evolution 🤝 Dynamic Duo (11) 🏆 Grand Slam ❓ The Questioner (3) 🗃️ Keyword Collector (57) 📈 Trend Setter 🔥 Unstoppable (13) 🚀 Conference Pioneer ⚡ Prolific Year (10) 💎 Century Club (52)

Conferences

ICML (14) NIPS (13) ICLR (10) IJCAI (5) AISTATS (4) AAAI (2) CORL (1) EMNLP (1) ICCV (1) UAI (1)

Top co-authors

Anca Dragan (11) Dylan Hadfield-Menell (6) Yi Wu (6) Scott Emmons (6) Pieter Abbeel (5) Adam Gleave (4) Hanlin Zhu (4) Michael D Dennis (4) Vincent Conitzer (3) Lei Li (3)

Keywords

bayesian inference (6) game theory (4) multi-agent system (4) partial observability (3) reward function (3) time series (3) generative model (2) representation learning (2) utility function (2) event detection (2) importance sampling (2) posterior inference (2) gaussian process (2) reward learning (2) adversarial training (2) approximate inference (2) human-robot interaction (2) markov chain monte carlo (2) inverse reinforcement learning (2) probabilistic programming (2)

Papers

Monitoring Latent World States in Language Models with Propositional Probes ICLR 2025
Extractive Structures Learned in Pretraining Enable Generalization on Finetuned Facts ICML 2025
Observation Interference in Partially Observable Assistance Games ICML 2025
RL, but don’t do anything I wouldn’t do UAI 2025
Avoiding Catastrophe in Online Learning by Asking for Help ICML 2025
AssistanceZero: Scalably Solving Assistance Games ICML 2025
The Partially Observable Off-Switch Game AAAI 2025
BAMDP Shaping: a Unified Framework for Intrinsic Motivation and Reward Shaping ICLR 2025
Diffusion On Syntax Trees For Program Synthesis ICLR 2025
The Effective Horizon Explains Deep RL Performance in Stochastic Environments ICLR 2024
Evidence of Learned Look-Ahead in a Chess-Playing Neural Network NIPS 2024
Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics NIPS 2024
When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback NIPS 2024
Trajectory Improvement and Reward Learning from Comparative Language Feedback CORL 2024
Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game ICLR 2024
AI Alignment with Changing and Influenceable Reward Functions ICML 2024
Image Hijacks: Adversarial Images can Control Generative Models at Runtime ICML 2024
Position: Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback ICML 2024
On Representation Complexity of Model-based and Model-free Reinforcement Learning ICLR 2024
SMCP3: Sequential Monte Carlo with Probabilistic Program Proposals AISTATS 2023
Invariance in Policy Optimisation and Partial Identifiability in Reward Learning ICML 2023
Who Needs to Know? Minimal Knowledge for Optimal Coordination ICML 2023
Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian ICLR 2023
Adversarial Policies Beat Superhuman Go AIs ICML 2023
For Learning in Symmetric Teams, Local Optima are Global Nash Equilibria ICML 2022
Estimating and Penalizing Induced Preference Shifts in Recommender Systems ICML 2022
Cross-Domain Imitation Learning via Optimal Transport ICLR 2022
Quantifying Differences in Reward Functions ICLR 2021
Adversarial Policies: Attacking Deep Reinforcement Learning ICLR 2020
Bayesian Relational Memory for Semantic Visual Navigation ICCV 2019
Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient AAAI 2019
An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning ICML 2018
Learning Plannable Representations with Causal InfoGAN NIPS 2018
Meta-Learning MCMC Proposals NIPS 2018
Negotiable Reinforcement Learning for Pareto Optimal Sequential Decision-Making NIPS 2018
Discrete-Continuous Mixtures in Probabilistic Programming: Generalized Semantics and Inference Algorithms ICML 2018
Efficient Reinforcement Learning with Hierarchies of Machines by Leveraging Internal Transitions IJCAI 2017
Should Robots be Obedient? IJCAI 2017
Signal-based Bayesian Seismic Monitoring AISTATS 2017
Adversarial Training for Relation Extraction EMNLP 2017
Inverse Reward Design NIPS 2017
The Off-Switch Game IJCAI 2017
Markovian State and Action Abstractions for MDPs via Hierarchical MCTS IJCAI 2016
Swift: Compiled Inference for Probabilistic Programming Languages IJCAI 2016
Cooperative Inverse Reinforcement Learning NIPS 2016
Gaussian Process Random Fields NIPS 2015
Algorithm selection by rational metareasoning as a model of human strategy selection NIPS 2014
Multilinear Dynamical Systems for Tensor Time Series NIPS 2013
Dynamic Scaled Sampling for Deterministic Constraints AISTATS 2013
Global seismic monitoring as probabilistic inference NIPS 2010
Why are DBNs sparse? AISTATS 2010
Probabilistic detection of short events, with application to critical care monitoring NIPS 2008