Shimon Whiteson
80 papers · 2006–2024 · across 8 top CS/AI conferences
Achievements
Keyword Pioneer
Hot Topic Early Bird
Taxonomy Completionist
(19)
Conference Polyglot
(8)
Renaissance Researcher
(9)
Interdisciplinary Bridge
Cross-Pollinator
(3)
Conference Loyalist
(26)
Dynamic Duo
(17)
Triple Crown
Deep Specialist
(10)
Keyword Champion
(7)
Grand Slam
Mega-Team
(21)
Topic Evolution
Trend Setter
Conference Pioneer
Unstoppable
(11)
The Questioner
(2)
Century Club
(80)
Keyword Collector
(83)
Prolific Year
(9)
Conferences
ICML (27)
NIPS (26)
ICLR (9)
JMLR (7)
CORL (4)
IJCAI (4)
AAAI (2)
UAI (1)
Keywords
multi-agent reinforcement learning
(15)
reinforcement learning
(9)
off-policy learning
(8)
function approximation
(7)
deep reinforcement learning
(5)
policy gradient
(5)
sample efficiency
(5)
value function
(5)
target network
(4)
temporal difference learning
(4)
continuous control
(4)
imitation learning
(4)
multi-agent system
(4)
meta-reinforcement learning
(3)
cooperative multi-agent
(3)
off-policy reinforcement learning
(3)
multi-task learning
(3)
curriculum learning
(3)
variance reduction
(3)
partial observability
(3)
Papers
Bayesian Exploration Networks
ICML 2024
JaxMARL: Multi-Agent RL Environments and Algorithms in JAX
NIPS 2024
Can Learned Optimization Make Reinforcement Learning Less Difficult?
NIPS 2024
Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
ICML 2024
Rate-Informed Discovery via Bayesian Adaptive Multifidelity Sampling
CORL 2024
Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps
NIPS 2024
Discovering Temporally-Aware Reinforcement Learning Algorithms
ICLR 2024
Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning
ICLR 2023
Universal Morphology Control via Contextual Modulation
ICML 2023
Why Target Networks Stabilise Temporal Difference Methods
ICML 2023
SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning
NIPS 2023
The Waymo Open Sim Agents Challenge
NIPS 2023
Recurrent Hypernetworks are Surprisingly Strong in Meta-RL
NIPS 2023
Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design
NIPS 2023
Generalized Beliefs for Cooperative AI
ICML 2022
Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency
AAAI 2022
Communicating via Markov Decision Processes
ICML 2022
Equivariant Networks for Zero-Shot Coordination
NIPS 2022
Embedding Synthetic Off-Policy Experience for Autonomous Driving via Zero-Shot Curricula
CORL 2022
Truncated Emphatic Temporal Difference Methods for Prediction and Control
JMLR 2022
Particle-Based Score Estimation for State Space Model Learning in Autonomous Driving
CORL 2022
Hypernetworks in Meta-Reinforcement Learning
CORL 2022
In Defense of the Unitary Scalarization for Deep Multi-Task Learning
NIPS 2022
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
AAAI 2021
Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning
ICML 2021
Breaking the Deadly Triad with a Target Network
ICML 2021
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
ICML 2021
VariBAD: Variational Bayes-Adaptive Deep RL via Meta-Learning
JMLR 2021
Deep Residual Reinforcement Learning (Extended Abstract)
IJCAI 2021
FACMAC: Factored Multi-Agent Centralised Policy Gradients
NIPS 2021
Regularized Softmax Deep Multi-Agent Q-Learning
NIPS 2021
Bayesian Bellman Operators
NIPS 2021
Snowflake: Scaling GNNs to high-dimensional continuous control via parameter freezing
NIPS 2021
Average-Reward Off-Policy Policy Evaluation with Function Approximation
ICML 2021
Transient Non-stationarity and Generalisation in Deep Reinforcement Learning
ICLR 2021
My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control
ICLR 2021
RODE: Learning Roles to Decompose Multi-Agent Tasks
ICLR 2021
UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning
ICML 2021
Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning
ICML 2021
Multitask Soft Option Learning
UAI 2020
Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver?
NIPS 2020
Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
NIPS 2020
Learning Retrospective Knowledge with Reverse Reinforcement Learning
NIPS 2020
VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning
ICLR 2020
Optimistic Exploration even with a Pessimistic Initialisation
ICLR 2020
Deep Coordination Graphs
ICML 2020
Growing Action Spaces
ICML 2020
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values
ICML 2020
Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation
ICML 2020
Expected Policy Gradients for Reinforcement Learning
JMLR 2020
Robust Reinforcement Learning with Bayesian Optimisation and Quadrature
JMLR 2020
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
JMLR 2020
Fast Context Adaptation via Meta-Learning
ICML 2019
Stable Opponent Shaping in Differentiable Games
ICLR 2019
Multi-Agent Common Knowledge Reinforcement Learning
NIPS 2019
MAVEN: Multi-Agent Variational Exploration
NIPS 2019
Fast Efficient Hyperparameter Tuning for Policy Gradient Methods
NIPS 2019
Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Gradient Estimators for Reinforcement Learning
NIPS 2019
VIREL: A Variational Inference Framework for Reinforcement Learning
NIPS 2019
DAC: The Double Actor-Critic Architecture for Learning Options
NIPS 2019
A Survey of Reinforcement Learning Informed by Natural Language
IJCAI 2019
Generalized Off-Policy Actor-Critic
NIPS 2019
Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning
ICML 2019
A Baseline for Any Order Gradient Estimation in Stochastic Computation Graphs
ICML 2019
Fingerprint Policy Optimisation for Robust Reinforcement Learning
ICML 2019
TACO: Learning Task Decomposition via Temporal Alignment for Control
ICML 2018
Deep Variational Reinforcement Learning for POMDPs
ICML 2018
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
ICML 2018
DiCE: The Infinitely Differentiable Monte Carlo Estimator
ICML 2018
Fourier Policy Gradients
ICML 2018
TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning
ICLR 2018
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
ICML 2017
Dynamic-Depth Context Tree Weighting
NIPS 2017
PAC Greedy Maximization with Efficient Bounds on Information Gain for Sensor Selection
IJCAI 2016
Learning to Communicate with Deep Multi-Agent Reinforcement Learning
NIPS 2016
Copeland Dueling Bandits
NIPS 2015
Point-Based Planning for Multi-Objective POMDPs
IJCAI 2015
Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem
ICML 2014
Exploiting Best-Match Equations for Efficient Reinforcement Learning
JMLR 2011
Evolutionary Function Approximation for Reinforcement Learning
JMLR 2006