Donghai Hong
4 papers · 2024–2025 · 3 conferences · across top CS/AI conferences
Achievements
Interdisciplinary Bridge · Keyword Pioneer · Conference Polyglot (3) · Cross-Pollinator (13) · Taxonomy Completionist (18) · Mega-Team (35)
Conferences
ACL (2)
NAACL (1)
NIPS (1)
Keywords
large language model (4)
reinforcement learning from human feedback (2)
preference learning (1)
knowledge distillation (1)
question answering (1)
model safety (1)
responsible ai (1)
safety alignment (1)
monte carlo tree search (1)
human feedback (1)
reward model (1)
hallucination reduction (1)
alignment method (1)
model-agnostic approach (1)
residual correction (1)
human preference (1)
policy model (1)
preference datum (1)
test-time scaling (1)
leaderboard evaluation (1)
Papers
PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference
ACL 2025
Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA
ACL 2025
Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability
NAACL 2025
Aligner: Efficient Alignment by Learning to Correct
NIPS 2024