Wenjie Jacky Mo
5 papers · 2025–2026 · 4 conferences · across top CS/AI conferences
Achievements
🌍 Conference Polyglot (4)
🌉 Interdisciplinary Bridge
🐝 Cross-Pollinator (15)
👥 Mega-Team (21)
Conferences
ACL (2)
EMNLP (1)
ICLR (1)
NAACL (1)
Keywords
large language model (3)
adversarial attack (2)
backdoor attack (2)
ai safety (1)
language model (1)
model fine-tuning (1)
red teaming (1)
black-box model (1)
backdoor detection (1)
multi-turn conversation (1)
safety classification (1)
poisoned datum (1)
trigger inversion (1)
meta classifier (1)
model security (1)
test-time defense (1)
in-context learning (1)
code generation (1)
Papers
RedCoder: Automated Multi-Turn Red Teaming for Code LLMs
ACL 2026
ThinkGuard: Deliberative Slow Thinking Leads to Cautious Guardrails
ACL 2025
Rethinking Backdoor Detection Evaluation for Language Models
EMNLP 2025
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding
ICLR 2025
Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations
NAACL 2025