Zeyu Zheng
21 papers · 2013–2026 · 5 conferences · across top CS/AI conferences
Achievements
🏃 Academic Marathon (12)
🧭 Keyword Pioneer
🌉 Interdisciplinary Bridge
🌍 Conference Polyglot (5)
🐝 Cross-Pollinator (13)
🏆 Keyword Champion (2)
🗃️ Keyword Collector (81)
💎 Century Club (20)
🔥 Unstoppable (6)
❓ The Questioner
Conferences
NIPS (9)
ICML (7)
ACL (2)
AISTATS (2)
AAAI (1)
Research topics
Keywords
regret bound (6)
multi-armed bandit (4)
reinforcement learning (3)
online learning (3)
intrinsic reward (2)
neural network (2)
worst-case optimality (2)
deep reinforcement learning (2)
expected regret (2)
stochastic optimization (2)
non-stationary environment (2)
optimal transport (1)
continual learning (1)
self-supervised learning (1)
uncertainty quantification (1)
policy gradient (1)
causal inference (1)
exploration exploitation (1)
model misspecification (1)
gaussian process (1)
Papers
A Comprehensive Survey of Process Reward Models: Data Generation, Model Construction, and Usage
ACL 2026
Efficient Fine-Grained Guidance for Diffusion Model Based Symbolic Music Generation
ICML 2025
Generalized Preference Optimization: A Unified Approach to Offline Alignment
ICML 2024
Normalization and effective learning rates in reinforcement learning
NIPS 2024
Stochastic Multi-Armed Bandits with Strongly Reward-Dependent Delays
AISTATS 2024
Human Alignment of Large Language Models through Online Preference Optimisation
ICML 2024
Non-stationary Experimental Design under Linear Trends
NIPS 2023
Stochastic Multi-armed Bandits: Optimal Trade-off among Optimality, Consistency, and Tail Risk
NIPS 2023
Understanding Plasticity in Neural Networks
ICML 2023
Contextual Gaussian Process Bandits with Neural Networks
NIPS 2023
A Simple and Optimal Policy Design for Online Learning with Safety against Heavy-tailed Risk
NIPS 2022
Adaptive Pairwise Weights for Temporal Credit Assignment
AAAI 2022
Gradient-Free Methods for Deterministic and Stochastic Nonsmooth Nonconvex Optimization
NIPS 2022
Dynamic Planning and Learning under Recovering Rewards
ICML 2021
Learning State Representations from Random Deep Action-conditional Predictions
NIPS 2021
On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification
AISTATS 2021
Stochastic $L^\natural$-convex Function Minimization
NIPS 2021
When Demands Evolve Larger and Noisier: Learning and Earning in a Growing Environment
ICML 2020
What Can Learned Intrinsic Rewards Capture?
ICML 2020
On Learning Intrinsic Rewards for Policy Gradient Methods
NIPS 2018
Extracting Events with Informal Temporal References in Personal Histories in Online Communities
ACL 2013