conftrace_

Junxiao Yang

5 papers · 2024–2026 · 2 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge · 🧭 Keyword Pioneer · 🐝 Cross-Pollinator (8) · 🌍 Conference Polyglot (2) · 🏆 Keyword Champion (2)

Conferences

ACL (4) · AAAI (1)

Top co-authors

Minlie Huang (5) · Hongning Wang (5) · Zhexin Zhang (5) · Shiyao Cui (4) · Han Qiu (2) · Yingkang Wang (2) · Fei Mi (2) · Jinzhe Tu (1) · Lifeng Shang (1) · Jiaqi Weng (1)

Keywords

safety alignment (2) · jailbreaking attack (2) · attack success rate (2) · supervised fine-tuning (1) · jailbreak attack (1) · goal prioritization (1) · reasoning process (1) · data distillation (1) · large reasoning model (1) · llm security (1) · multilingual safety (1) · language-agnostic representation (1) · large language model (1) · safety enhancement (1) · semantic cognition (1) · safety defense (1) · emoji-triggered toxicity (1) · gradient-based optimization (1) · semantic bottleneck (1) · instruction following (1)

Papers

When Smiley Turns Hostile: Interpreting How Emojis Trigger LLMs’ Toxicity · AAAI 2026
How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study · ACL 2026
LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety · ACL 2026
Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints · ACL 2025
Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization · ACL 2024