conftrace_

Liu Yan

6 papers · 2024–2026 · 3 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌍 Conference Polyglot (3) 🐝 Cross-Pollinator (12) 🌉 Interdisciplinary Bridge

Conferences

EMNLP (3) ACL (2) ICLR (1)

Top co-authors

Han Qiu (6) Tianwei Zhang (4) Minlie Huang (3) Qingjie Zhang (3) Di Wang (3) Ke Xu (3) Chao Zhang (3) Haiqin Weng (2) Jianshuo Dong (2) Yiming Li (2)

Research topics

Security & Privacy (1)

Keywords

large language model (3) data augmentation (1) prompt engineering (1) instruction following (1) intent detection (1) harmful content (1) safety alignment (1) hidden state (1) synthetic data (1) jailbreak attack (1) linear probing (1) cognitive bias (1) training data (1) token analysis (1) model internals (1) Chinese language (1) prompt bias (1) probing classifier (1) linear probe (1) prompt leakage (1)

Papers

Revisiting the Reliability of Language Models in Instruction-Following · ACL 2026
Understanding the Dark Side of LLMs’ Intrinsic Self-Correction · ACL 2025
“I’ve Decided to Leak”: Probing Internals Behind Prompt Leakage Intents · EMNLP 2025
Speculating LLMs’ Chinese Training Data Pollution from Their Tokens · EMNLP 2025
A Benchmark for Semantic Sensitive Information in LLMs Outputs · ICLR 2025
Course-Correction: Safety Alignment Using Synthetic Preferences · EMNLP 2024