Kaile Wang
4 papers · 2025 · 2 conferences · top CS/AI venues
Achievements
Interdisciplinary Bridge · Keyword Pioneer · Conference Polyglot (2) · Cross-Pollinator (7) · Renaissance Researcher (5) · Taxonomy Completionist (19)
Conferences
ACL (3)
AAAI (1)
Keywords
reinforcement learning from human feedback (3)
large language model (2)
language model alignment (2)
preference learning (1)
safety alignment (1)
data compression (1)
bayesian network (1)
human preference (1)
preference datum (1)
language alignment (1)
alignment fine-tuning (1)
model elasticity (1)
pre-training distribution (1)
harmful output mitigation (1)
distribution induction (1)
sentence-level alignment (1)
helpful assistant (1)
elastic collapse (1)
helpful harmlessness (1)
harmlessness annotation (1)
Papers
Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction
AAAI 2025
Language Models Resist Alignment: Evidence From Data Compression
ACL 2025
PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference
ACL 2025
Reward Generalization in RLHF: A Topological Perspective
ACL 2025