conftrace_

Xuehai Pan

8 papers · 2022–2025 · 4 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (4) 🐝 Cross-Pollinator (13) 🗺️ Taxonomy Completionist (19) 🐣 Hot Topic Early Bird

Conferences

NIPS (4) ICLR (2) ACL (1) JMLR (1)

Top co-authors

Yaodong Yang (7) Jiaming Ji (6) Mickel Liu (5) Josef Dai (4) Yizhou Wang (4) Ruiyang Sun (4) Jiayi Zhou (3) Borong Zhang (3) Yiran Geng (2) Weidong Huang (2)

Keywords

reinforcement learning from human feedback (3) safe reinforcement learning (2) large language model (2) preference learning (2) constraint optimization (1) knowledge distillation (1) content moderation (1) policy learning (1) language model alignment (1) ai safety (1) safety alignment (1) risk minimization (1) constraint satisfaction (1) human feedback (1) bayesian network (1) hallucination reduction (1) safety benchmark (1) alignment method (1) model-agnostic approach (1) residual correction (1)

Papers

Reward Generalization in RLHF: A Topological Perspective · ACL 2025
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research · JMLR 2024
Aligner: Efficient Alignment by Learning to Correct · NIPS 2024
Safe RLHF: Safe Reinforcement Learning from Human Feedback · ICLR 2024
Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark · NIPS 2023
Proactive Multi-Camera Collaboration for 3D Human Pose Estimation · ICLR 2023
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset · NIPS 2023
MATE: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control · NIPS 2022