Jan Leike
14 papers · 2015–2025 · 6 conferences · across top CS/AI conferences
Achievements
Academic Marathon (10) · Keyword Pioneer · Interdisciplinary Bridge · Conference Polyglot (6) · Cross-Pollinator (11)
Academic Marathon (10)
Taxonomy Completionist (25)
Renaissance Researcher (6)
Keyword Trendsetter Combo (3)
The Namer
Topic Evolution
Mega-Team (20)
Century Club (14)
Unstoppable (8)
Trend Setter
Keyword Collector (53)
Conferences
ICLR (4)
IJCAI (3)
NIPS (3)
ICML (2)
AISTATS (1)
COLT (1)
Keywords
preference learning (3)
reward function (3)
inverse reinforcement learning (2)
reinforcement learning (2)
reward model (2)
deep reinforcement learning (2)
reward learning (2)
human preference (2)
sequential decision making (1)
reinforcement learning from human feedback (1)
bayesian inference (1)
ai safety (1)
model alignment (1)
trajectory optimization (1)
language model alignment (1)
imitation learning (1)
value of information (1)
demonstration learning (1)
sequence prediction (1)
kolmogorov complexity (1)
Papers
Scaling and evaluating sparse autoencoders
ICLR 2025
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
ICML 2024
Let's Verify Step by Step
ICLR 2024
Training language models to follow instructions with human feedback
NIPS 2022
Quantifying Differences in Reward Functions
ICLR 2021
Pitfalls of Learning a Reward Function Online
IJCAI 2020
Learning Human Objectives by Evaluating Hypothetical Behavior
ICML 2020
Learning to Understand Goal Specifications by Modelling Reward
ICLR 2019
Reward learning from human preferences and demonstrations in Atari
NIPS 2018
Deep Reinforcement Learning from Human Preferences
NIPS 2017
Universal Reinforcement Learning Algorithms: Survey and Experiments
IJCAI 2017
On Thompson Sampling and Asymptotic Optimality
IJCAI 2017
Loss Bounds and Time Complexity for Speed Priors
AISTATS 2016
Bad Universal Priors and Notions of Optimality
COLT 2015