Paul Weng

18 papers · 2013–2025 · 6 top CS/AI conferences

Achievements

🏃 Academic Marathon (12) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (6) 🐣 Hot Topic Early Bird

🐝 Cross-Pollinator (7) 🧬 Topic Evolution 🗃️ Keyword Collector (57) 💎 Century Club (18) 🚀 Conference Pioneer

Conferences

ICML (8) IJCAI (4) AAAI (2) ICLR (2) ACML (1) CORL (1)

Top co-authors

Zhaohui Jiang (4) Matthieu Zimmer (4) Xuening Feng (4) Róbert Busa-Fekete (3) Balázs Szörényi (3) Yifei Zhu (3) Eyke Hüllermeier (3) Claire Glanois (2) Timo Kaufmann (2) Umer Siddique (2)

Keywords

multi-armed bandit (3) regret bound (3) fair policy (2) deep reinforcement learning (2) reward function (2) policy gradient (2) multi-objective optimization (2) offline reinforcement learning (1) epistemic uncertainty (1) sample efficiency (1) robotic manipulation (1) policy optimization (1) preference learning (1) policy learning (1) reinforcement learning (1) query selection (1) regret minimization (1) imitation learning (1) model-based reinforcement learning (1) reinforcement learning from human feedback (1)

Papers

Comparing Comparisons: Informative and Easy Human Feedback with Distinguishability Queries · ICML 2025
DUO: Diverse, Uncertain, On-Policy Query Generation and Selection for Reinforcement Learning from Human Feedback · AAAI 2025
Enhancing Online Reinforcement Learning with Meta-Learned Objective from Offline Data · AAAI 2025
Reinforcement Learning from Imperfect Corrective Actions and Proxy Rewards · ICLR 2025
INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer · ICML 2024
Revisiting Data Augmentation in Deep Reinforcement Learning · ICLR 2024
Solving Complex Manipulation Tasks with Model-Assisted Model-Free Reinforcement Learning · CORL 2022
CVaR-Regret Bounds for Multi-armed Bandits · ACML 2022
Neuro-Symbolic Hierarchical Rule Induction · ICML 2022
Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning · ICML 2021
Learning Fair Policies in Multi-Objective (Deep) Reinforcement Learning with Average and Discounted Rewards · ICML 2020
Exploiting the Sign of the Advantage Function to Learn Deterministic Policies in Continuous Domains · IJCAI 2019
Multi-objective Bandits: Optimizing the Generalized Gini Index · ICML 2017
Optimization of Probabilistic Argumentation with Markov Decision Models · IJCAI 2015
Qualitative Multi-Armed Bandits: A Quantile-Based Approach · ICML 2015
Solving MDPs with Skew Symmetric Bilinear Utility Functions · IJCAI 2015
Interactive Value Iteration for Markov Decision Processes with Unknown Rewards · IJCAI 2013
Top-k Selection based on Adaptive Sampling of Noisy Preferences · ICML 2013