conftrace_

Han Zhong

29 papers · 2021–2025 · 5 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (5) 🐝 Cross-Pollinator (13) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5)

🗺️ Taxonomy Completionist (31) 🤝 Dynamic Duo (15) 👑 Triple Crown 🗃️ Keyword Collector (76) 🔥 Unstoppable (5) 💎 Century Club (29) ⚡ Prolific Year (9) ❓ The Questioner

Conferences

ICML (12) NIPS (10) ICLR (5) AISTATS (1) JMLR (1)

Top co-authors

Liwei Wang (15) Tong Zhang (9) Zhaoran Wang (7) Zhuoran Yang (7) Wei Xiong (6) Jiachen Hu (3) Miao Lu (3) Tianhao Wu (3) Yunchang Yang (3) Lin Yang (3)

Keywords

regret bound (9) markov game (4) function approximation (4) reinforcement learning (3) posterior sampling (2) offline reinforcement learning (2) sample efficiency (2) linear bandit (2) model-based reinforcement learning (2) sample complexity (2) multi-armed bandit (2) minimax optimization (1) computational complexity (1) equilibrium learning (1) policy optimization (1) robust statistics (1) adversarial robustness (1) nash equilibrium (1) linear function approximation (1) vc dimension (1)

Papers

The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability · ICML 2025
DPO Meets PPO: Reinforced Token Optimization for RLHF · ICML 2025
BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning · ICML 2025
Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret · ICML 2024
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment · ICML 2024
Rethinking Model-based, Policy-based, and Value-based Reinforcement Learning via the Lens of Representation Complexity · NIPS 2024
A3S: A General Active Clustering Method with Pairwise Constraints · ICML 2024
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond · ICML 2024
Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithms · NIPS 2024
Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation · AISTATS 2024
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint · ICML 2024
Towards Robust Offline Reinforcement Learning under Diverse Data Corruption · ICLR 2024
Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation · ICLR 2024
Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopically Rational Followers? · JMLR 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration · NIPS 2023
Posterior Sampling for Competitive RL: Function Approximation and Partial Observation · NIPS 2023
A Reduction-based Framework for Sequential Decision Making with Delayed Feedback · NIPS 2023
Tackling Heavy-Tailed Rewards in Reinforcement Learning with Function Approximation: Minimax Optimal and Instance-Dependent Regret Bounds · NIPS 2023
Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage · NIPS 2023
A Theoretical Analysis of Optimistic Proximal Policy Optimization in Linear Markov Decision Processes · NIPS 2023
Provable Sim-to-real Transfer in Continuous Domain with Partial Observations · ICLR 2023
Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game · ICLR 2023
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets · ICML 2022
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games · ICML 2022
Nearly Optimal Policy Optimization with Stable at Any Time Guarantee · ICML 2022
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation · ICML 2022
A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning · ICLR 2022
Why Robust Generalization in Deep Learning is Difficult: Perspective of Expressive Power · NIPS 2022
Breaking the Moments Condition Barrier: No-Regret Algorithm for Bandits with Super Heavy-Tailed Payoffs · NIPS 2021