Xuehai Pan
8 papers · 2022–2025 · 4 conferences · across top CS/AI conferences
Achievements
Interdisciplinary Bridge · Keyword Pioneer · Conference Polyglot (4) · Cross-Pollinator (13) · Taxonomy Completionist (19) · Hot Topic Early Bird
Conferences
NIPS (4)
ICLR (2)
ACL (1)
JMLR (1)
Keywords
reinforcement learning from human feedback (3)
safe reinforcement learning (2)
large language model (2)
preference learning (2)
constraint optimization (1)
knowledge distillation (1)
content moderation (1)
policy learning (1)
language model alignment (1)
ai safety (1)
safety alignment (1)
risk minimization (1)
constraint satisfaction (1)
human feedback (1)
bayesian network (1)
hallucination reduction (1)
safety benchmark (1)
alignment method (1)
model-agnostic approach (1)
residual correction (1)
Papers
Reward Generalization in RLHF: A Topological Perspective
ACL 2025
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research
JMLR 2024
Aligner: Efficient Alignment by Learning to Correct
NIPS 2024
Safe RLHF: Safe Reinforcement Learning from Human Feedback
ICLR 2024
Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark
NIPS 2023
Proactive Multi-Camera Collaboration for 3D Human Pose Estimation
ICLR 2023
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset
NIPS 2023
MATE: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control
NIPS 2022