Yaodong Yang
99 papers · 2018–2026 · 13 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Interdisciplinary Bridge
Hot Topic Early Bird
Cross-Pollinator (14)
Taxonomy Completionist (17)
Renaissance Researcher (6)
Conference Loyalist (24)
Deep Specialist (30)
Triple Crown
Topic Evolution
Keyword Champion (3)
Grand Slam
Mega-Team (35)
Dynamic Duo (22)
Unstoppable (8)
Prolific Year (12)
Century Club (94)
Keyword Collector (57)
Conferences
NIPS (24)
AAAI (15)
ICLR (15)
ICML (15)
ACL (10)
CORL (5)
IJCAI (5)
JMLR (5)
CVPR (1)
EMNLP (1)
ICCV (1)
NAACL (1)
WACV (1)
Top co-authors
Keywords
multi-agent reinforcement learning (28)
large language model (14)
nash equilibrium (10)
multi-agent system (10)
game theory (10)
policy optimization (6)
reinforcement learning (6)
reinforcement learning from human feedback (6)
safe reinforcement learning (5)
policy learning (4)
preference optimization (4)
human preference (4)
fictitious play (4)
deep reinforcement learning (4)
determinantal point process (3)
transformer architecture (3)
offline reinforcement learning (3)
responsible ai (3)
multi-agent learning (3)
preference learning (3)
Papers
SafeMCP: Proactive Power Regulation for LLM Agent Defense via Environment-Grounded Look-Ahead Reasoning
ACL 2026
A Game-Theoretic Negotiation Framework for Cross-Cultural Consensus
ACL 2026
DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping
AAAI 2026
SafeMT: Multi-turn Safety for Multimodal Language Models
ACL 2026
Communication-Efficient Desire Alignment for Proactive Embodied Human–Agent Interaction
ACL 2026
Differentiable Information Enhanced Model-Based Reinforcement Learning
AAAI 2025
Towards Efficient Collaboration via Graph Modeling in Reinforcement Learning
AAAI 2025
RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors
AAAI 2025
Falcon: Fast Visuomotor Policies via Partial Denoising
ICML 2025
In-Context Editing: Learning Knowledge from Self-Induced Distributions
ICLR 2025
Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs
ICLR 2025
Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization
ICLR 2025
Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models
ICLR 2025
Magnetic Preference Optimization: Achieving Last-iterate Convergence for Language Model Alignment
ICLR 2025
Enhancing LLM-Based Social Bot via an Adversarial Learning Framework
EMNLP 2025
Benchmarking Multi-National Value Alignment for Large Language Models
ACL 2025
Reward Generalization in RLHF: A Topological Perspective
ACL 2025
SafeLawBench: Towards Safe Alignment of Large Language Models
ACL 2025
Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA
ACL 2025
PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference
ACL 2025
Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability
NAACL 2025
Language Models Resist Alignment: Evidence From Data Compression
ACL 2025
SAE-V: Interpreting Multimodal Models for Enhanced Alignment
ICML 2025
ClutterDexGrasp: A Sim-to-Real System for General Dexterous Grasping in Cluttered Scenes
CORL 2025
Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback
AAAI 2025
Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction
AAAI 2025
Off-Agent Trust Region Policy Optimization
IJCAI 2024
ProgressGym: Alignment with a Millennium of Moral Progress
NIPS 2024
SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset
NIPS 2024
Panacea: Pareto Alignment via Preference Adaptation for LLMs
NIPS 2024
Aligner: Efficient Alignment by Learning to Correct
NIPS 2024
Scalable Constrained Policy Optimization for Safe Multi-agent Reinforcement Learning
NIPS 2024
Object-Centric Dexterous Manipulation from Human Motion Data
CORL 2024
Neural Attention Field: Emerging Point Relevance in 3D Scenes for One-Shot Dexterous Grasping
CORL 2024
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
AAAI 2024
STAS: Spatial-Temporal Return Decomposition for Solving Sparse Rewards Problems in Multi-agent Reinforcement Learning
AAAI 2024
ProAgent: Building Proactive Cooperative Agents with Large Language Models
AAAI 2024
AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents
CVPR 2024
Safe RLHF: Safe Reinforcement Learning from Human Feedback
ICLR 2024
SafeDreamer: Safe Reinforcement Learning with World Models
ICLR 2024
Maximum Entropy Heterogeneous-Agent Reinforcement Learning
ICLR 2024
CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents
ICLR 2024
Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game
ICLR 2024
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation
ICML 2024
Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning
ICML 2024
End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations
ICML 2024
Sample-Efficient Multiagent Reinforcement Learning with Reset Replay
ICML 2024
Heterogeneous-Agent Reinforcement Learning
JMLR 2024
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research
JMLR 2024
DPPMask: Masked Image Modeling With Determinantal Point Processes
WACV 2024
Learning to Shape Rewards Using a Game of Two Partners
AAAI 2023
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset
NIPS 2023
MANSA: Learning Fast and Slow in Multi-Agent Systems
ICML 2023
A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems
ICML 2023
Regret-Minimizing Double Oracle for Extensive-Form Games
ICML 2023
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models
ICML 2023
MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning
JMLR 2023
Subspace-Aware Exploration for Sparse-Reward Multi-Agent Tasks
AAAI 2023
MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library
JMLR 2023
TorchOpt: An Efficient Library for Differentiable Optimization
JMLR 2023
Team-PSRO for Learning Approximate TMECor in Large Team Games via Cooperative Reinforcement Learning
NIPS 2023
Hierarchical Multi-Agent Skill Discovery
NIPS 2023
Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark
NIPS 2023
UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-Aware Curriculum and Iterative Generalist-Specialist Learning
ICCV 2023
Multi-Agent First Order Constrained Optimization in Policy Space
NIPS 2023
Quality-Similar Diversity via Population Based Reinforcement Learning
ICLR 2023
Boosting Multiagent Reinforcement Learning via Permutation Invariant and Permutation Equivariant Networks
ICLR 2023
Policy Space Diversity for Non-Transitive Games
NIPS 2023
Dynamic Handover: Throw and Catch with Bimanual Hands
CORL 2023
ACE: Cooperative Multi-Agent Q-learning with Bidirectional Action-Dependency
AAAI 2023
On the Convergence of Fictitious Play: A Decomposition Approach
IJCAI 2022
Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning
ICLR 2022
LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning
ICLR 2022
Transformer-based Working Memory for Multiagent Reinforcement Learning with Action Parsing
NIPS 2022
A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning
NIPS 2022
MATE: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control
NIPS 2022
Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning
NIPS 2022
Constrained Update Projection Approach to Safe Policy Optimization
NIPS 2022
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
NIPS 2022
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
NIPS 2022
A Unified Diversity Measure for Multiagent Reinforcement Learning
NIPS 2022
What about Inputting Policy in Value Function: Policy Representation and Policy-Extended Value Function Approximator
AAAI 2022
Towards Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games
NIPS 2021
Neural Auto-Curricula in Two-Player Zero-Sum Games
NIPS 2021
Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction
AAAI 2021
Modelling Behavioural Diversity for Learning in Open-Ended Games
ICML 2021
Learning in Nonzero-Sum Stochastic Games with Potentials
ICML 2021
Settling the Variance of Multi-Agent Policy Gradients
NIPS 2021
SMARTS: An Open-Source Scalable Multi-Agent RL Training School for Autonomous Driving
CORL 2020
Replica-Exchange Nosé-Hoover Dynamics for Bayesian Learning on Large Datasets
NIPS 2020
Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning
IJCAI 2020
Multi-Agent Determinantal Q-Learning
ICML 2020
Bi-Level Actor-Critic for Multi-Agent Coordination
AAAI 2020
Q-value Path Decomposition for Deep Multiagent Reinforcement Learning
ICML 2020
Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning
ICLR 2019
Large-Scale Home Energy Management Using Entropy-Based Collective Multiagent Deep Reinforcement Learning Framework
IJCAI 2019
Mean Field Multi-Agent Reinforcement Learning
ICML 2018
Thermostat-assisted continuously-tempered Hamiltonian Monte Carlo for Bayesian learning
NIPS 2018
Recurrent Deep Multiagent Q-Learning for Autonomous Brokers in Smart Grid
IJCAI 2018