Zhuoran Yang
129 papers · 2015–2026 · 10 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (22)
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (5)
Conference Polyglot (10)
Conference Loyalist (41)
Dynamic Duo (96)
Triple Crown
Deep Specialist (33)
Topic Evolution
Keyword Champion (15)
Unstoppable (11)
Prolific Year (14)
The Questioner (7)
Trend Setter
Century Club (128)
Keyword Collector (92)
Conference Pioneer
Conferences
ICML (47)
NIPS (41)
ICLR (17)
JMLR (8)
AISTATS (7)
L4DC (3)
ACL (2)
COLT (2)
CORL (1)
ICCV (1)
Research topics
Keywords
regret bound (19)
reinforcement learning (18)
function approximation (15)
neural network (12)
offline reinforcement learning (10)
policy optimization (9)
multi-agent system (9)
markov decision process (9)
nash equilibrium (8)
linear function approximation (8)
multi-agent reinforcement learning (8)
representation learning (8)
sample complexity (7)
zero-sum game (6)
value iteration (6)
bilevel optimization (5)
causal inference (4)
partially observable markov decision process (4)
exponential family (4)
policy gradient (4)
Papers
Probing Audio-Visual Reasoning in Multimodal Language Models through the Lens of Audio
ACL 2026
The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability
ICML 2025
BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms
ICML 2025
In-Context Linear Regression Demystified: Training Dynamics and Mechanistic Interpretability of Multi-Head Softmax Attention
ICML 2025
Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model
ICLR 2025
In-Context Reinforcement Learning From Suboptimal Historical Data
ICML 2025
Decoding Rewards in Competitive Games: Inverse Game Theory with Entropy Regularization
ICML 2025
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
JMLR 2025
InstaDrive: Instance-Aware Driving World Models for Realistic and Consistent Video Generation
ICCV 2025
What and How does In-Context Learning Learn? Bayesian Model Averaging, Parameterization, and Generalization
AISTATS 2025
Learning Task Representations from In-Context Learning
ACL 2025
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation
CORL 2025
An Instrumental Value for Data Production and its Application to Data Pricing
ICML 2025
On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games
NIPS 2024
Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers
NIPS 2024
Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach
JMLR 2024
Learning Regularized Graphon Mean-Field Games with Unknown Graphons
JMLR 2024
Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning
JMLR 2024
How Does Goal Relabeling Improve Sample Efficiency?
ICML 2024
Mean Field Langevin Actor-Critic: Faster Convergence and Global Optimality beyond Lazy Learning
ICML 2024
A General Framework for Sequential Decision-Making under Adaptivity Constraints
ICML 2024
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
ICML 2024
From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems
ICML 2024
Theory of Consistency Diffusion Models: Distribution Estimation Meets Fast Sampling
ICML 2024
Sample-Efficient Multi-Agent RL: An Optimization Perspective
ICLR 2024
Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation
ICLR 2024
Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems
ICLR 2024
Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models via Reinforcement Learning
AISTATS 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
NIPS 2023
Posterior Sampling for Competitive RL: Function Approximation and Partial Observation
NIPS 2023
Online Performative Gradient Descent for Learning Nash Equilibria in Decision-Dependent Games
NIPS 2023
Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning
NIPS 2023
Learning Regularized Monotone Graphon Mean-Field Games
NIPS 2023
Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games
ICLR 2023
Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics
ICLR 2023
Represent to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency
ICLR 2023
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes
ICLR 2023
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
ICLR 2023
Can We Find Nash Equilibria at a Linear Rate in Markov Games?
ICLR 2023
Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model
ICML 2023
Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP
ICML 2023
Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments
ICML 2023
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning
ICML 2023
Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopically Rational Followers?
JMLR 2023
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning
JMLR 2023
Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning
L4DC 2023
Welfare Maximization in Competitive Equilibrium: Reinforcement Learning for Markov Exchange Economy
ICML 2022
Inducing Equilibria via Incentives: Simultaneous Design-and-Play Ensures Global Convergence
NIPS 2022
Provably Efficient Offline Reinforcement Learning for Partially Observable Markov Decision Processes
ICML 2022
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation
ICML 2022
Adaptive Model Design for Markov Decision Process
ICML 2022
Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency
ICML 2022
Reinforcement Learning with Logarithmic Regret and Policy Switches
NIPS 2022
Exponential Family Model-Based Reinforcement Learning via Score Matching
NIPS 2022
Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets
NIPS 2022
A Unifying Framework of Off-Policy General Value Function Evaluation
NIPS 2022
Gap-Dependent Bounds for Two-Player Markov Games
AISTATS 2022
Towards General Function Approximation in Zero-Sum Markov Games
ICLR 2022
Reinforcement Learning under a Multi-agent Predictive State Representation Model: Method and Theory
ICLR 2022
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning
ICLR 2022
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets
ICML 2022
Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning
ICML 2022
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning
ICML 2022
Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL
NIPS 2022
Learning from Demonstration: Provably Efficient Adversarial Policy Imitation with Linear Function Approximation
ICML 2022
BooVI: Provably Efficient Bootstrapped Value Iteration
NIPS 2021
Pessimism Meets Invariance: Provably Efficient Offline Mean-Field Multi-Agent RL
NIPS 2021
Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic
NIPS 2021
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
ICLR 2021
Provably Sample Efficient Reinforcement Learning in Competitive Linear Quadratic Systems
L4DC 2021
Provably Efficient Safe Exploration via Primal-Dual Policy Optimization
AISTATS 2021
Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach
ICML 2021
Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games
ICML 2021
Randomized Exploration in Reinforcement Learning with General Value Function Approximation
ICML 2021
Is Pessimism Provably Efficient for Offline RL?
ICML 2021
Infinite-Dimensional Optimization for Zero-Sum Games via Variational Transport
ICML 2021
Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
ICML 2021
On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game
ICML 2021
Reinforcement Learning for Cost-Aware Markov Decision Processes
ICML 2021
Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time
ICML 2021
Learning While Playing in Mean-Field Games: Convergence and Optimality
ICML 2021
Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
ICML 2021
A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum
NIPS 2021
Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration
NIPS 2021
Provably Efficient Causal Reinforcement Learning with Confounded Observational Data
NIPS 2021
Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning
NIPS 2021
Sample Elicitation
AISTATS 2021
Provably Efficient Actor-Critic for Risk-Sensitive and Robust Adversarial RL: A Linear-Quadratic Case
AISTATS 2021
Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework
NIPS 2020
Dynamic Regret of Policy Optimization in Non-Stationary Environments
NIPS 2020
Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss
NIPS 2020
On Computation and Generalization of Generative Adversarial Imitation Learning
ICLR 2020
Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees
ICML 2020
Provably Efficient Exploration in Policy Optimization
ICML 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
NIPS 2020
Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret
NIPS 2020
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
ICLR 2020
Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games
ICLR 2020
Provably efficient reinforcement learning with linear function approximation
COLT 2020
Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Convergence Rate
ICML 2020
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium
COLT 2020
Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning
ICML 2020
On the Global Optimality of Model-Agnostic Meta-Learning
ICML 2020
Robust One-Bit Recovery via ReLU Generative Networks: Near-Optimal Statistical Rate and Global Landscape Analysis
ICML 2020
A Theoretical Analysis of Deep Q-Learning
L4DC 2020
Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations
NIPS 2020
Provably Efficient Neural GTD for Off-Policy Learning
NIPS 2020
Provably Efficient Neural Estimation of Structural Equation Models: An Adversarial Approach
NIPS 2020
High-dimensional Varying Index Coefficient Models via Stein's Identity
JMLR 2019
Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games
NIPS 2019
Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy
NIPS 2019
Variance Reduced Policy Evaluation with Smooth Function Approximation
NIPS 2019
Statistical-Computational Tradeoff in Single Index Models
NIPS 2019
Convergent Policy Optimization for Safe Reinforcement Learning
NIPS 2019
On the statistical rate of nonlinear recovery in generative models with heavy-tailed data
ICML 2019
Neural Temporal-Difference Learning Converges to Global Optima
NIPS 2019
Provably Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost
NIPS 2019
The Edge Density Barrier: Computational-Statistical Tradeoffs in Combinatorial Inference
ICML 2018
Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization
NIPS 2018
On Semiparametric Exponential Family Graphical Models
JMLR 2018
Contrastive Learning from Pairwise Measurements
NIPS 2018
Nonlinear Structured Signal Estimation in High Dimensions via Iterative Hard Thresholding
AISTATS 2018
Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents
ICML 2018
Provable Gaussian Embedding with One Observation
NIPS 2018
Estimating High-dimensional Non-Gaussian Multiple Index Models via Stein's Lemma
NIPS 2017
High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation
ICML 2017
Sparse Nonlinear Regression: Parameter Estimation under Nonconvexity
ICML 2016
More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning
NIPS 2016
Human Memory Search as Initial-Visit Emitting Random Walk
NIPS 2015