Zhichen Dong
6 papers · 2024–2025 · 5 conferences · across top CS/AI conferences
Achievements
🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (5) 🐝 Cross-Pollinator (13) 🐣 Hot Topic Early Bird
⚡ Prolific Year (5)
Conferences
ICLR (2)
ACL (1)
ICML (1)
NAACL (1)
NIPS (1)
Keywords
large language model (2)
harmful content (2)
text generation (1)
instruction following (1)
model merging (1)
safety alignment (1)
model alignment (1)
greedy search (1)
language model (1)
safety evaluation (1)
weak-to-strong generalization (1)
human preference alignment (1)
test-time optimization (1)
red teaming (1)
token distribution (1)
prompt injection (1)
defense mechanism (1)
model compression (1)
conversation safety (1)
adversarial learning (1)
Papers
Emergent Response Planning in LLMs
ICML 2025
Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models
NIPS 2024
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
ACL 2024
Towards Imitation Learning to Branch for MIP: A Hybrid Reinforcement Learning based Sample Augmentation Approach
ICLR 2024
L2P-MIP: Learning to Presolve for Mixed Integer Programming
ICLR 2024
Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
NAACL 2024