Paul Weng
18 papers · 2013–2025 · 6 conferences · across top CS/AI conferences
Achievements
🏃 Academic Marathon (12)
🧭 Keyword Pioneer
🌉 Interdisciplinary Bridge
🌍 Conference Polyglot (6)
🐣 Hot Topic Early Bird
🐝 Cross-Pollinator (7)
🧬 Topic Evolution
🗃️ Keyword Collector (57)
💎 Century Club (18)
🚀 Conference Pioneer
Conferences
ICML (8)
IJCAI (4)
AAAI (2)
ICLR (2)
ACML (1)
CoRL (1)
Keywords
multi-armed bandit (3)
regret bound (3)
fair policy (2)
deep reinforcement learning (2)
reward function (2)
policy gradient (2)
multi-objective optimization (2)
offline reinforcement learning (1)
epistemic uncertainty (1)
sample efficiency (1)
robotic manipulation (1)
policy optimization (1)
preference learning (1)
policy learning (1)
reinforcement learning (1)
query selection (1)
regret minimization (1)
imitation learning (1)
model-based reinforcement learning (1)
reinforcement learning from human feedback (1)
Papers
Comparing Comparisons: Informative and Easy Human Feedback with Distinguishability Queries
ICML 2025
DUO: Diverse, Uncertain, On-Policy Query Generation and Selection for Reinforcement Learning from Human Feedback
AAAI 2025
Enhancing Online Reinforcement Learning with Meta-Learned Objective from Offline Data
AAAI 2025
Reinforcement Learning from Imperfect Corrective Actions and Proxy Rewards
ICLR 2025
INViT: A Generalizable Routing Problem Solver with Invariant Nested View Transformer
ICML 2024
Revisiting Data Augmentation in Deep Reinforcement Learning
ICLR 2024
Solving Complex Manipulation Tasks with Model-Assisted Model-Free Reinforcement Learning
CoRL 2022
CVaR-Regret Bounds for Multi-armed Bandits
ACML 2022
Neuro-Symbolic Hierarchical Rule Induction
ICML 2022
Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning
ICML 2021
Learning Fair Policies in Multi-Objective (Deep) Reinforcement Learning with Average and Discounted Rewards
ICML 2020
Exploiting the Sign of the Advantage Function to Learn Deterministic Policies in Continuous Domains
IJCAI 2019
Multi-objective Bandits: Optimizing the Generalized Gini Index
ICML 2017
Optimization of Probabilistic Argumentation with Markov Decision Models
IJCAI 2015
Qualitative Multi-Armed Bandits: A Quantile-Based Approach
ICML 2015
Solving MDPs with Skew Symmetric Bilinear Utility Functions
IJCAI 2015
Interactive Value Iteration for Markov Decision Processes with Unknown Rewards
IJCAI 2013
Top-k Selection based on Adaptive Sampling of Noisy Preferences
ICML 2013