conftrace_

Kaile Wang

4 papers · 2025 · 2 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (2) 🐝 Cross-Pollinator (7) 🌈 Renaissance Researcher (5)

🗺️ Taxonomy Completionist (19)

Conferences

ACL (3) AAAI (1)

Top co-authors

Jiaming Ji (4) Yaodong Yang (4) Tianyi Alex Qiu (3) Jiayi Zhou (3) Josef Dai (3) Boyuan Chen (2) Hantao Lou (2) Boxun Li (1) Borong Zhang (1) Dong Yan (1)

Keywords

reinforcement learning from human feedback (3) large language model (2) language model alignment (2) preference learning (1) safety alignment (1) data compression (1) bayesian network (1) human preference (1) preference datum (1) language alignment (1) alignment fine-tuning (1) model elasticity (1) pre-training distribution (1) harmful output mitigation (1) distribution induction (1) sentence-level alignment (1) helpful assistant (1) elastic collapse (1) helpful harmlessness (1) harmlessness annotation (1)

Papers

Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction · AAAI 2025
Language Models Resist Alignment: Evidence From Data Compression · ACL 2025
PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference · ACL 2025
Reward Generalization in RLHF: A Topological Perspective · ACL 2025