conftrace_

Luyao Niu

12 papers · 2023–2026 · 7 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (5) 🐣 Hot Topic Early Bird 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (20)

🌍 Conference Polyglot (7) 🤝 Dynamic Duo (10) 💎 Century Club (10) ⚡ Prolific Year (5)

Conferences

ACL (6) AAAI (1) EMNLP (1) ICLR (1) IJCAI (1) NAACL (1) NIPS (1)

Top co-authors

Radha Poovendran (12) Fengqing Jiang (10) Zhangchen Xu (9) Bill Yuchen Lin (7) Bhaskar Ramasubramanian (6) Yuetai Li (5) Bo Li (3) Dinuka Sahabandu (2) Xiang Yue (2) Jinyuan Jia (2)

Keywords

large language model (8) safety alignment (3) adversarial learning (3) jailbreak attack (3) knowledge distillation (2) chain-of-thought reasoning (2) decoding strategy (2) adversarial attack (2) llm safety (2) instruction tuning (2) backdoor attack (2) gaussian process (1) harmful content (1) model safety (1) model robustness (1) safety evaluation (1) adversarial training (1) social network (1) adversarial prompt (1) text generation (1)

Papers

BadScientist: Can a Research Agent Write Convincing but Unsound Papers that Fool LLM Reviewers? · ACL 2026
Temporal Sampling for Forgotten Reasoning in LLMs · ACL 2026
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing · ICLR 2025
SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities · ACL 2025
Small Models Struggle to Learn from Strong Reasoners · ACL 2025
ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates · AAAI 2025
Stronger Models are Not Always Stronger Teachers for Instruction Tuning · NAACL 2025
CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models · EMNLP 2024
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding · ACL 2024
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs · ACL 2024
Learning Dissemination Strategies for External Sources in Opinion Dynamic Models with Cognitive Biases · IJCAI 2023
FedGame: A Game-Theoretic Defense against Backdoor Attacks in Federated Learning · NIPS 2023