Junxiao Yang
5 papers · 2024–2026 · 2 top CS/AI conferences
Achievements
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🐝 Cross-Pollinator (8)
🌍 Conference Polyglot (2)
🏆 Keyword Champion (2)
Conferences
ACL (4)
AAAI (1)
Keywords
safety alignment (2)
jailbreaking attack (2)
attack success rate (2)
supervised fine-tuning (1)
jailbreak attack (1)
goal prioritization (1)
reasoning process (1)
data distillation (1)
large reasoning model (1)
llm security (1)
multilingual safety (1)
language-agnostic representation (1)
large language model (1)
safety enhancement (1)
semantic cognition (1)
safety defense (1)
emoji-triggered toxicity (1)
gradient-based optimization (1)
semantic bottleneck (1)
instruction following (1)
Papers
When Smiley Turns Hostile: Interpreting How Emojis Trigger LLMs’ Toxicity
AAAI 2026
How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study
ACL 2026
LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety
ACL 2026
Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints
ACL 2025
Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
ACL 2024