Yunhao Tang
45 papers · 2018–2025 · 8 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Hot Topic Early Bird
Interdisciplinary Bridge
Taxonomy Completionist (10)
Conference Polyglot (8)
Renaissance Researcher (5)
Conference Loyalist (24)
Dynamic Duo (23)
Triple Crown
Keyword Champion (4)
Grand Slam
Deep Specialist (23)
Century Club (45)
Prolific Year (8)
Keyword Collector (154)
Trend Setter
Unstoppable (8)
Conferences
ICML (24)
AISTATS (8)
NIPS (7)
AAAI (2)
CORL (1)
ICLR (1)
IJCAI (1)
JMLR (1)
Keywords
reinforcement learning (17)
deep reinforcement learning (8)
policy optimization (8)
distributional reinforcement learning (6)
variance reduction (6)
policy gradient (6)
off-policy learning (5)
representation learning (4)
value function (3)
evolution strategy (3)
blackbox optimization (3)
multi-step learning (3)
value estimation (2)
off-policy evaluation (2)
policy evaluation (2)
spectral decomposition (2)
gradient estimation (2)
continuous control (2)
self-supervised learning (2)
sample complexity (2)
Papers
Optimizing Language Models for Inference Time Objectives using Reinforcement Learning
ICML 2025
Categorical Distributional Reinforcement Learning with Kullback-Leibler Divergence: Convergence and Asymptotics
ICML 2025
A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning
AISTATS 2025
On scalable oversight with weak LLMs judging strong LLMs
NIPS 2024
Near-Minimax-Optimal Distributional Reinforcement Learning with a Generative Model
NIPS 2024
Generalized Preference Optimization: A Unified Approach to Offline Alignment
ICML 2024
Nash Learning from Human Feedback
ICML 2024
Learning Uncertainty-Aware Temporally-Extended Actions
AAAI 2024
Human Alignment of Large Language Models through Online Preference Optimisation
ICML 2024
An Analysis of Quantile Temporal-Difference Learning
JMLR 2024
A Distributional Analogue to the Successor Representation
ICML 2024
Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition
ICML 2023
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice
ICML 2023
Quantile Credit Assignment
ICML 2023
The Edge of Orthogonality: A Simple View of What Makes BYOL Tick
ICML 2023
The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation
ICML 2023
Understanding Self-Predictive Learning for Reinforcement Learning
ICML 2023
DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm
ICML 2023
Towards a better understanding of representation dynamics under TD-learning
ICML 2023
VA-learning as a more efficient alternative to Q-learning
ICML 2023
Fast Rates for Maximum Entropy Exploration
ICML 2023
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses
ICML 2022
Marginalized Operators for Off-policy Reinforcement Learning
AISTATS 2022
BYOL-Explore: Exploration by Bootstrapped Prediction
NIPS 2022
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning
NIPS 2022
Biased Gradient Estimate with Drastic Variance Reduction for Meta Reinforcement Learning
ICML 2022
Hindsight Expectation Maximization for Goal-conditioned Reinforcement Learning
AISTATS 2021
Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation
NIPS 2021
Revisiting Peng's Q($\lambda$) for Modern Reinforcement Learning
ICML 2021
Taylor Expansion of Discount Factors
ICML 2021
Learning to Score Behaviors for Guided Policy Optimization
ICML 2020
Taylor Expansion Policy Optimization
ICML 2020
Self-Imitation Learning via Generalized Lower Bound Q-learning
NIPS 2020
Discretizing Continuous Action Space for On-Policy Optimization
AAAI 2020
Practical Nonisotropic Monte Carlo Sampling in High Dimensions via Determinantal Point Processes
AISTATS 2020
Discrete Action On-Policy Learning with Action-Value Critic
AISTATS 2020
ES-MAML: Simple Hessian-Free Meta Learning
ICLR 2020
Monte-Carlo Tree Search as Regularized Policy Optimization
ICML 2020
Reinforcement Learning for Integer Programming: Learning to Cut
ICML 2020
Variance Reduction for Evolution Strategies via Structured Control Variates
AISTATS 2020
Provably Robust Blackbox Optimization for Reinforcement Learning
CORL 2019
Orthogonal Estimation of Wasserstein Distances
AISTATS 2019
From Complexity to Simplicity: Adaptive ES-Active Subspaces for Blackbox Optimization
NIPS 2019
KAMA-NNs: Low-dimensional Rotation Based Neural Networks
AISTATS 2019
Exploration by Distributional Reinforcement Learning
IJCAI 2018