Zhaoran Wang
131 papers · 2013–2025 · 8 conferences across top CS/AI venues
Achievements
Taxonomy Completionist (23)
Keyword Pioneer
Hot Topic Early Bird
Renaissance Researcher (5)
Interdisciplinary Bridge
Cross-Pollinator (13)
Conference Loyalist (50)
Dynamic Duo (96)
Triple Crown
Topic Evolution
Keyword Champion (2)
Deep Specialist (34)
Keyword Collector (87)
Conference Pioneer
Prolific Year (19)
Unstoppable (13)
The Questioner (6)
Trend Setter
Century Club (131)
Conferences
NIPS (50)
ICML (43)
ICLR (16)
AISTATS (9)
JMLR (7)
L4DC (3)
COLT (2)
EMNLP (1)
Keywords
regret bound (19)
reinforcement learning (18)
function approximation (14)
neural network (12)
offline reinforcement learning (10)
multi-agent system (9)
linear function approximation (8)
policy optimization (8)
markov decision process (8)
high-dimensional statistics (7)
multi-agent reinforcement learning (7)
nash equilibrium (7)
sample complexity (6)
value iteration (6)
deep reinforcement learning (6)
policy gradient (6)
nonconvex optimization (6)
model-based reinforcement learning (5)
upper confidence bound (5)
representation learning (5)
Papers
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
ICML 2025
BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning
ICML 2025
Toward Optimal LLM Alignments Using Two-Player Games
EMNLP 2025
An Instrumental Value for Data Production and its Application to Data Pricing
ICML 2025
The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability
ICML 2025
What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization
AISTATS 2025
Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?
ICLR 2025
Sample-Efficient Multi-Agent RL: An Optimization Perspective
ICLR 2024
Let Models Speak Ciphers: Multiagent Debate through Embeddings
ICLR 2024
A General Framework for Sequential Decision-Making under Adaptivity Constraints
ICML 2024
How Does Goal Relabeling Improve Sample Efficiency?
ICML 2024
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer
NIPS 2024
Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning
JMLR 2024
Learning Regularized Graphon Mean-Field Games with Unknown Graphons
JMLR 2024
Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents
ICML 2024
Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach
JMLR 2024
Adaptive-Gradient Policy Optimization: Enhancing Policy Learning in Non-Smooth Differentiable Simulations
ICML 2024
Learning Regularized Monotone Graphon Mean-Field Games
NIPS 2023
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning
ICML 2023
Adaptive Barrier Smoothing for First-Order Policy Gradient with Contact Dynamics
ICML 2023
Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments
ICML 2023
Achieving Hierarchy-Free Approximation for Bilevel Programs with Equilibrium Constraints
ICML 2023
Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning
L4DC 2023
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning
JMLR 2023
Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopically Rational Followers?
JMLR 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
NIPS 2023
Posterior Sampling for Competitive RL: Function Approximation and Partial Observation
NIPS 2023
Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms
NIPS 2023
Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models via Reinforcement Learning
AISTATS 2023
Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics
ICLR 2023
Represent to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency
ICLR 2023
Latent Variable Representation for Reinforcement Learning
ICLR 2023
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes
ICLR 2023
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
ICLR 2023
Inducing Equilibria via Incentives: Simultaneous Design-and-Play Ensures Global Convergence
NIPS 2022
Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL
NIPS 2022
Towards General Function Approximation in Zero-Sum Markov Games
ICLR 2022
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets
ICML 2022
Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy
ICML 2022
Provably Efficient Offline Reinforcement Learning for Partially Observable Markov Decision Processes
ICML 2022
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning
ICLR 2022
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation
ICML 2022
FinRL-Meta: Market Environments and Benchmarks for Data-Driven Financial Reinforcement Learning
NIPS 2022
Adaptive Model Design for Markov Decision Process
ICML 2022
A Unifying Framework of Off-Policy General Value Function Evaluation
NIPS 2022
Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning
ICML 2022
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning
ICML 2022
Learning from Demonstration: Provably Efficient Adversarial Policy Imitation with Linear Function Approximation
ICML 2022
Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets
NIPS 2022
RORL: Robust Offline Reinforcement Learning via Conservative Smoothing
NIPS 2022
Exponential Family Model-Based Reinforcement Learning via Score Matching
NIPS 2022
Gap-Dependent Bounds for Two-Player Markov Games
AISTATS 2022
Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency
ICML 2022
Infinite-Dimensional Optimization for Zero-Sum Games via Variational Transport
ICML 2021
BooVI: Provably Efficient Bootstrapped Value Iteration
NIPS 2021
Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic
NIPS 2021
Dynamic Bottleneck for Robust Self-Supervised Exploration
NIPS 2021
Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning
NIPS 2021
Pessimism Meets Invariance: Provably Efficient Offline Mean-Field Multi-Agent RL
NIPS 2021
Provably Efficient Causal Reinforcement Learning with Confounded Observational Data
NIPS 2021
Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration
NIPS 2021
A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum
NIPS 2021
Is Pessimism Provably Efficient for Offline RL?
ICML 2021
Randomized Exploration in Reinforcement Learning with General Value Function Approximation
ICML 2021
Sample Elicitation
AISTATS 2021
Provably Efficient Actor-Critic for Risk-Sensitive and Robust Adversarial RL: A Linear-Quadratic Case
AISTATS 2021
Provably Efficient Safe Exploration via Primal-Dual Policy Optimization
AISTATS 2021
Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games
ICML 2021
Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach
ICML 2021
Principled Exploration via Optimistic Bootstrapping and Backward Induction
ICML 2021
Learning While Playing in Mean-Field Games: Convergence and Optimality
ICML 2021
Provably Sample Efficient Reinforcement Learning in Competitive Linear Quadratic Systems
L4DC 2021
Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time
ICML 2021
On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game
ICML 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
ICML 2021
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
ICLR 2021
Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
ICML 2021
Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games
ICLR 2020
Dynamic Regret of Policy Optimization in Non-Stationary Environments
NIPS 2020
Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework
NIPS 2020
Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach
NIPS 2020
Provably Efficient Neural GTD for Off-Policy Learning
NIPS 2020
Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations
NIPS 2020
Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss
NIPS 2020
End-to-End Learning and Intervention in Games
NIPS 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
NIPS 2020
Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret
NIPS 2020
Provably Efficient Reinforcement Learning with Linear Function Approximation
COLT 2020
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium
COLT 2020
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
ICLR 2020
On Computation and Generalization of Generative Adversarial Imitation Learning
ICLR 2020
Provably Efficient Exploration in Policy Optimization
ICML 2020
Computational and Statistical Tradeoffs in Inferring Combinatorial Structures of Ising Model
ICML 2020
Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees
ICML 2020
Deep Reinforcement Learning with Robust and Smooth Policy
ICML 2020
On the Global Optimality of Model-Agnostic Meta-Learning
ICML 2020
Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning
ICML 2020
Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Convergence Rate
ICML 2020
Agnostic Estimation for Phase Retrieval
JMLR 2020
A Theoretical Analysis of Deep Q-Learning
L4DC 2020
Provably Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost
NIPS 2019
Variance Reduced Policy Evaluation with Smooth Function Approximation
NIPS 2019
Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy
NIPS 2019
Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy
ICLR 2019
Accelerating Nonconvex Learning via Replica Exchange Langevin Diffusion
ICLR 2019
On the Statistical Rate of Nonlinear Recovery in Generative Models with Heavy-Tailed Data
ICML 2019
High-dimensional Varying Index Coefficient Models via Stein's Identity
JMLR 2019
Statistical-Computational Tradeoff in Single Index Models
NIPS 2019
Convergent Policy Optimization for Safe Reinforcement Learning
NIPS 2019
Neural Temporal-Difference Learning Converges to Global Optima
NIPS 2019
The Edge Density Barrier: Computational-Statistical Tradeoffs in Combinatorial Inference
ICML 2018
Minimax-Optimal Privacy-Preserving Sparse PCA in Distributed Systems
AISTATS 2018
Nonlinear Structured Signal Estimation in High Dimensions via Iterative Hard Thresholding
AISTATS 2018
Contrastive Learning from Pairwise Measurements
NIPS 2018
Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization
NIPS 2018
Provable Gaussian Embedding with One Observation
NIPS 2018
Estimating High-dimensional Non-Gaussian Multiple Index Models via Stein's Lemma
NIPS 2017
Agnostic Estimation for Misspecified Phase Retrieval Models
NIPS 2016
Blind Attacks on Machine Learners
NIPS 2016
Online ICA: Understanding Global Dynamics of Nonconvex Optimization via Diffusion Processes
NIPS 2016
On the Statistical Limits of Convex Relaxations
ICML 2016
More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning
NIPS 2016
NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization
NIPS 2016
Sparse Nonlinear Regression: Parameter Estimation under Nonconvexity
ICML 2016
A Nonconvex Optimization Framework for Low Rank Matrix Estimation
NIPS 2015
Non-convex Statistical Optimization for Sparse Tensor Graphical Model
NIPS 2015
Optimal Linear Estimation under Unknown Nonlinear Transform
NIPS 2015
High Dimensional EM Algorithm: Statistical Optimization and Asymptotic Normality
NIPS 2015
Tighten after Relax: Minimax-Optimal Sparse PCA in Polynomial Time
NIPS 2014
Sparse PCA with Oracle Property
NIPS 2014
Sparse Principal Component Analysis for High Dimensional Multivariate Time Series
AISTATS 2013