Zhenhong Zhou
13 papers · 2024–2026 · 5 top CS/AI conferences
Achievements
Cross-Pollinator (14)
Conference Polyglot (5)
Keyword Pioneer
Hot Topic Early Bird
Renaissance Researcher (5)
Interdisciplinary Bridge
Taxonomy Completionist (11)
Century Club (10)
Prolific Year (5)
Conferences
EMNLP (5)
ACL (4)
AAAI (2)
ICLR (1)
ICML (1)
Top co-authors
Keywords
large language model (6)
backdoor attack (3)
safety alignment (3)
jailbreak attack (2)
resource consumption (2)
privacy risk (1)
harmful content (1)
preference learning (1)
adversarial defense (1)
hidden state (1)
synthetic data (1)
safety evaluation (1)
model safety (1)
training data (1)
representation space (1)
black-box setting (1)
model alignment (1)
jailbreak defense (1)
communication topology (1)
adversarial machine learning (1)
Papers
RiskLab: A Controlled Toolkit for Probing Emergent Risks in LLM-Based Multi-Agent Systems
ACL 2026
Hidden in the Noise: Unveiling Backdoors in Audio LLMs Alignment Through Latent Acoustic Pattern Triggers
AAAI 2026
Backdoor Collapse: Eliminating Unknown Threats Via Known Backdoor Aggregation In Language Models
ACL 2026
SEE: Signal Embedding Energy for Quantifying Noise Interference in Large Audio Language Models
ACL 2026
Reinforced Lifelong Editing for Language Models
ICML 2025
PD3F: A Pluggable and Dynamic DoS-Defense Framework against resource consumption attacks targeting Large Language Models
EMNLP 2025
Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings
ACL 2025
DemonAgent: Dynamically Encrypted Multi-Backdoor Implantation Attack on LLM-based Agent
EMNLP 2025
On the Role of Attention Heads in Large Language Model Safety
ICLR 2025
Quantifying and Analyzing Entity-Level Memorization in Large Language Models
AAAI 2024
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
EMNLP 2024
Course-Correction: Safety Alignment Using Synthetic Preferences
EMNLP 2024
Alignment-Enhanced Decoding: Defending Jailbreaks via Token-Level Adaptive Refining of Probability Distributions
EMNLP 2024