John Schulman
19 papers · 2013–2024 · 5 top CS/AI conferences
Achievements
🌉 Interdisciplinary Bridge
🌍 Conference Polyglot (5)
🧭 Keyword Pioneer
🐣 Hot Topic Early Bird
🏃 Academic Marathon (11)
🔬 Deep Specialist (10)
🧬 Topic Evolution
👥 Mega-Team (20)
🗃️ Keyword Collector (76)
📈 Trend Setter
💎 Century Club (19)
🔥 Unstoppable (10)
🚀 Conference Pioneer
Conferences
ICML (7)
NIPS (7)
ICLR (2)
RSS (2)
CORL (1)
Keywords
deep reinforcement learning (5)
reinforcement learning (5)
policy gradient (4)
sample efficiency (3)
reward model (3)
data augmentation (2)
reinforcement learning from human feedback (2)
exploration bonus (2)
intrinsic motivation (2)
neural network (2)
continuous control (2)
instruction following (1)
image generation (1)
language model alignment (1)
motion planning (1)
variational inference (1)
model alignment (1)
domain generalization (1)
gradient estimation (1)
policy optimization (1)
Papers
Let's Verify Step by Step
ICLR 2024
Rule Based Rewards for Language Model Safety
NIPS 2024
Scaling Laws for Reward Model Overoptimization
ICML 2023
Batch size-invariance for policy optimization
NIPS 2022
Training language models to follow instructions with human feedback
NIPS 2022
Phasic Policy Gradient
ICML 2021
Leveraging Procedural Generation to Benchmark Reinforcement Learning
ICML 2020
Distribution Augmentation for Generative Modeling
ICML 2020
Quantifying Generalization in Reinforcement Learning
ICML 2019
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
RSS 2018
Model-Based Reinforcement Learning via Meta-Policy Optimization
CORL 2018
Meta Learning Shared Hierarchies
ICLR 2018
#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
NIPS 2017
VIME: Variational Information Maximizing Exploration
NIPS 2016
Benchmarking Deep Reinforcement Learning for Continuous Control
ICML 2016
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
NIPS 2016
Gradient Estimation Using Stochastic Computation Graphs
NIPS 2015
Trust Region Policy Optimization
ICML 2015
Finding Locally Optimal, Collision-Free Trajectories with Sequential Convex Optimization
RSS 2013