Zenghao Duan
5 papers · 2025–2026 · 3 conferences · across top CS/AI conferences
Achievements
🌍 Conference Polyglot (3)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🐝 Cross-Pollinator (15)
Conferences
ACL (3)
EMNLP (1)
NAACL (1)
Keywords
adversarial attack (2)
jailbreak attack (2)
harmful content (2)
ai safety (1)
safety alignment (1)
model alignment (1)
llm safety (1)
test-time scaling (1)
large language model (1)
harmful knowledge (1)
reasoning dynamics (1)
toxic content (1)
llm detoxification (1)
threat assessment (1)
harmfulness judgment (1)
toxic subspace (1)
overthinking detection (1)
early-exit method (1)
adversarial learning (1)
reasoning completion point (1)
Papers
The Evolution of Thought: Tracking LLM Overthinking via Reasoning Dynamics Analysis
ACL 2026
Projecting Out the Malice: A Global Subspace Approach to LLM Detoxification
ACL 2026
from Benign import Toxic: Jailbreaking the Language Model via Adversarial Metaphors
ACL 2025
Confusion is the Final Barrier: Rethinking Jailbreak Evaluation and Investigating the Real Misuse Threat of LLMs
EMNLP 2025
Related Knowledge Perturbation Matters: Rethinking Multiple Pieces of Knowledge Editing in Same-Subject
NAACL 2025