conftrace_

Yunhao Tang

45 papers · 2018–2025 · 8 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (10) 🌍 Conference Polyglot (8)

🌈 Renaissance Researcher (5) 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🏠 Conference Loyalist (24) 🤝 Dynamic Duo (23) 👑 Triple Crown 🏆 Keyword Champion (4) 🏆 Grand Slam 🔬 Deep Specialist (23) 💎 Century Club (45) ⚡ Prolific Year (8) 🗃️ Keyword Collector (154) 📈 Trend Setter 🔥 Unstoppable (8)

Conferences

ICML (24) AISTATS (8) NIPS (7) AAAI (2) CORL (1) ICLR (1) IJCAI (1) JMLR (1)

Top co-authors

Rémi Munos (23) Mark Rowland (19) Michal Valko (17) Will Dabney (10) Bernardo Avila Pires (7) Krzysztof Choromanski (7) Daniele Calandriello (6) Zhaohan Daniel Guo (6) Aldo Pacchiano (6) Bilal Piot (6)

Keywords

reinforcement learning (17) deep reinforcement learning (8) policy optimization (8) distributional reinforcement learning (6) variance reduction (6) policy gradient (6) off-policy learning (5) representation learning (4) value function (3) evolution strategy (3) blackbox optimization (3) multi-step learning (3) value estimation (2) off-policy evaluation (2) policy evaluation (2) spectral decomposition (2) gradient estimation (2) continuous control (2) self-supervised learning (2) sample complexity (2)

Papers

Optimizing Language Models for Inference Time Objectives using Reinforcement Learning · ICML 2025
Categorical Distributional Reinforcement Learning with Kullback-Leibler Divergence: Convergence and Asymptotics · ICML 2025
A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning · AISTATS 2025
On scalable oversight with weak LLMs judging strong LLMs · NIPS 2024
Near-Minimax-Optimal Distributional Reinforcement Learning with a Generative Model · NIPS 2024
Generalized Preference Optimization: A Unified Approach to Offline Alignment · ICML 2024
Nash Learning from Human Feedback · ICML 2024
Learning Uncertainty-Aware Temporally-Extended Actions · AAAI 2024
Human Alignment of Large Language Models through Online Preference Optimisation · ICML 2024
An Analysis of Quantile Temporal-Difference Learning · JMLR 2024
A Distributional Analogue to the Successor Representation · ICML 2024
Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition · ICML 2023
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice · ICML 2023
Quantile Credit Assignment · ICML 2023
The Edge of Orthogonality: A Simple View of What Makes BYOL Tick · ICML 2023
The Statistical Benefits of Quantile Temporal-Difference Learning for Value Estimation · ICML 2023
Understanding Self-Predictive Learning for Reinforcement Learning · ICML 2023
DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm · ICML 2023
Towards a better understanding of representation dynamics under TD-learning · ICML 2023
VA-learning as a more efficient alternative to Q-learning · ICML 2023
Fast Rates for Maximum Entropy Exploration · ICML 2023
From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses · ICML 2022
Marginalized Operators for Off-policy Reinforcement Learning · AISTATS 2022
BYOL-Explore: Exploration by Bootstrapped Prediction · NIPS 2022
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning · NIPS 2022
Biased Gradient Estimate with Drastic Variance Reduction for Meta Reinforcement Learning · ICML 2022
Hindsight Expectation Maximization for Goal-conditioned Reinforcement Learning · AISTATS 2021
Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation · NIPS 2021
Revisiting Peng’s Q(λ) for Modern Reinforcement Learning · ICML 2021
Taylor Expansion of Discount Factors · ICML 2021
Learning to Score Behaviors for Guided Policy Optimization · ICML 2020
Taylor Expansion Policy Optimization · ICML 2020
Self-Imitation Learning via Generalized Lower Bound Q-learning · NIPS 2020
Discretizing Continuous Action Space for On-Policy Optimization · AAAI 2020
Practical Nonisotropic Monte Carlo Sampling in High Dimensions via Determinantal Point Processes · AISTATS 2020
Discrete Action On-Policy Learning with Action-Value Critic · AISTATS 2020
ES-MAML: Simple Hessian-Free Meta Learning · ICLR 2020
Monte-Carlo Tree Search as Regularized Policy Optimization · ICML 2020
Reinforcement Learning for Integer Programming: Learning to Cut · ICML 2020
Variance Reduction for Evolution Strategies via Structured Control Variates · AISTATS 2020
Provably Robust Blackbox Optimization for Reinforcement Learning · CORL 2019
Orthogonal Estimation of Wasserstein Distances · AISTATS 2019
From Complexity to Simplicity: Adaptive ES-Active Subspaces for Blackbox Optimization · NIPS 2019
KAMA-NNs: Low-dimensional Rotation Based Neural Networks · AISTATS 2019
Exploration by Distributional Reinforcement Learning · IJCAI 2018