Hanlin Zhu
12 papers · 2019–2025 · 7 conferences · across top CS/AI conferences
Achievements
Hot Topic Early Bird
Keyword Pioneer
Interdisciplinary Bridge
Conference Polyglot (7)
Academic Marathon (6)
Cross-Pollinator (10)
Renaissance Researcher (6)
Taxonomy Completionist (23)
Century Club (12)
Conferences
NIPS (3)
EMNLP (2)
ICLR (2)
ICML (2)
AISTATS (1)
COLT (1)
IJCNLP (1)
Keywords
reward estimation (2)
task-oriented dialog (2)
multi-domain dialog (2)
offline reinforcement learning (2)
dialog policy (2)
adversarial inverse reinforcement learning (2)
sample complexity (1)
in-context learning (1)
policy optimization (1)
empirical risk minimization (1)
importance sampling (1)
gradient descent (1)
value function approximation (1)
query complexity (1)
inverse reinforcement learning (1)
logical reasoning (1)
bilinear model (1)
regret bound (1)
autoregressive model (1)
policy learning (1)
Papers
Avoiding Catastrophe in Online Learning by Asking for Help
ICML 2025
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning
ICML 2025
Learning Personalized Alignment for Evaluating Open-ended Text Generation
EMNLP 2024
Towards a Theoretical Understanding of the 'Reversal Curse' via Training Dynamics
NIPS 2024
On Representation Complexity of Model-based and Model-free Reinforcement Learning
ICLR 2024
Provably Efficient Offline Goal-Conditioned Reinforcement Learning with General Function Approximation and Single-Policy Concentrability
NIPS 2023
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
NIPS 2023
Provably Efficient Reinforcement Learning via Surprise Bound
AISTATS 2023
Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian
ICLR 2023
Average-Case Communication Complexity of Statistical Problems
COLT 2021
Guided Dialog Policy Learning: Reward Estimation for Multi-Domain Task-Oriented Dialog
EMNLP 2019
Guided Dialog Policy Learning: Reward Estimation for Multi-Domain Task-Oriented Dialog
IJCNLP 2019