Shenao Zhang
10 papers · 2022–2025 · 3 top CS/AI conferences
Achievements
Cross-Pollinator (15)
Conference Polyglot (3)
Renaissance Researcher (5)
Interdisciplinary Bridge
Taxonomy Completionist (11)
Keyword Pioneer
Hot Topic Early Bird
Century Club (10)
Conferences
ICML (5)
NIPS (4)
ACL (1)
Keywords
model-based reinforcement learning (3)
regret bound (2)
policy optimization (2)
policy gradient (2)
language model alignment (1)
posterior sampling (1)
reinforcement learning from human feedback (1)
robot learning (1)
value function (1)
reparameterization gradient (1)
sublinear regret (1)
model-free reinforcement learning (1)
credit assignment (1)
gradient variance (1)
global optimality (1)
adversarial regularizer (1)
exploration efficiency (1)
conservative policy optimization (1)
multi-step reasoning (1)
contact dynamics (1)
Papers
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
ICML 2025
Offline Reinforcement Learning for LLM Multi-step Reasoning
ACL 2025
BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning
ICML 2025
Adaptive-Gradient Policy Optimization: Enhancing Policy Learning in Non-Smooth Differentiable Simulations
ICML 2024
Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents
ICML 2024
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer
NIPS 2024
Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics
ICML 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
NIPS 2023
Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms
NIPS 2023
Conservative Dual Policy Optimization for Efficient Model-Based Reinforcement Learning
NIPS 2022