conftrace_

Zhichen Dong

6 papers · 2024–2025 · 5 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (5) 🐝 Cross-Pollinator (13) 🐣 Hot Topic Early Bird

⚡ Prolific Year (5)

Conferences

ICLR (2) ACL (1) ICML (1) NAACL (1) NIPS (1)

Top co-authors

Chao Yang (4) Zhanhui Zhou (4) Yu Qiao (3) Jie Liu (2) Zhixuan Liu (2) Junchi Yan (2) Haobo Ma (1) Chaochao Lu (1) Bowen Pang (1) Wanli Ouyang (1)

Keywords

large language model (2) harmful content (2) text generation (1) instruction following (1) model merging (1) safety alignment (1) model alignment (1) greedy search (1) language model (1) safety evaluation (1) weak-to-strong generalization (1) human preference alignment (1) test-time optimization (1) red teaming (1) token distribution (1) prompt injection (1) defense mechanism (1) model compression (1) conversation safety (1) adversarial learning (1)

Papers

Emergent Response Planning in LLMs · ICML 2025
Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models · NIPS 2024
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! · ACL 2024
Towards Imitation Learning to Branch for MIP: A Hybrid Reinforcement Learning based Sample Augmentation Approach · ICLR 2024
L2P-MIP: Learning to Presolve for Mixed Integer Programming · ICLR 2024
Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey · NAACL 2024