Joey Hong
17 papers · 2019–2025 · 5 top CS/AI conferences
Achievements
🏃 Academic Marathon (6)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🌍 Conference Polyglot (5)
🐝 Cross-Pollinator (14)
🏆 Keyword Champion (2)
👑 Triple Crown
🗃️ Keyword Collector (51)
💎 Century Club (17)
🔥 Unstoppable (7)
❓ The Questioner
Conferences
ICLR (6)
ICML (5)
AISTATS (3)
NeurIPS (2)
CVPR (1)
Keywords
thompson sampling (4)
regret bound (4)
multi-armed bandit (3)
bayesian inference (2)
off-policy learning (2)
hierarchical bayesian model (2)
latent state (2)
contextual bandit (2)
autonomous driving (1)
human behavior (1)
offline reinforcement learning (1)
bandit feedback (1)
behavioral adaptation (1)
hierarchical model (1)
policy optimization (1)
upper confidence bound (1)
program synthesis (1)
trajectory prediction (1)
online learning (1)
multi-task learning (1)
Papers
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models
ICML 2025
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
ICLR 2025
ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis
ICLR 2024
Offline RL with Observation Histories: Analyzing and Improving Sample Complexity
ICLR 2024
Learning to Explore in POMDPs with Informational Rewards
ICML 2024
On the Sensitivity of Reward Inference to Misspecified Human Models
ICLR 2023
Learning to Influence Human Behavior with Offline Reinforcement Learning
NeurIPS 2023
Confidence-Conditioned Value Functions for Offline Reinforcement Learning
ICLR 2023
Multi-Task Off-Policy Learning from Bandit Feedback
ICML 2023
Should I Run Offline Reinforcement Learning or Behavioral Cloning?
ICLR 2022
Hierarchical Bayesian Bandits
AISTATS 2022
Deep Hierarchy in Bandits
ICML 2022
Thompson Sampling with a Mixture Prior
AISTATS 2022
Non-Stationary Off-Policy Optimization
AISTATS 2021
Latent Programmer: Discrete Latent Codes for Program Synthesis
ICML 2021
Latent Bandits Revisited
NeurIPS 2020
Rules of the Road: Predicting Driving Behavior With a Convolutional Model of Semantic Interactions
CVPR 2019