Rongwu Xu
14 papers · 2024–2026 · 5 top CS/AI conferences
Achievements
Cross-Pollinator (15)
Interdisciplinary Bridge
Taxonomy Completionist (29)
Keyword Pioneer
Hot Topic Early Bird
Conference Polyglot (4)
Prolific Year (9)
Century Club (12)
Keyword Collector (62)
The Questioner
Conferences
EMNLP (6)
ACL (5)
EACL (1)
ICLR (1)
NeurIPS (1)
Keywords
large language model (6)
safety alignment (2)
jailbreak attack (2)
adversarial robustness (1)
preference learning (1)
machine translation (1)
model robustness (1)
prompt engineering (1)
chain-of-thought reasoning (1)
neural machine translation (1)
text generation (1)
factual knowledge (1)
language model evaluation (1)
ai safety (1)
harmful content (1)
question answering (1)
hidden state (1)
bias mitigation (1)
language model (1)
reward modeling (1)
Papers
DebateQA: Evaluating Question Answering on Debatable Knowledge
EACL 2026
AwarenessBench: Assessing Cognitive Capabilities of Language Models
ACL 2026
Nuclear Deployed!: Analyzing Catastrophic Risks in Decision-making of Autonomous LLM Agents
ACL 2025
Does Chain-of-Thought Reasoning Really Reduce Harmfulness from Jailbreaking?
ACL 2025
On the Role of Attention Heads in Large Language Model Safety
ICLR 2025
MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs
NeurIPS 2024
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
EMNLP 2024
LONG2RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall
EMNLP 2024
Sing it, Narrate it: Quality Musical Lyrics Translation
EMNLP 2024
Course-Correction: Safety Alignment Using Synthetic Preferences
EMNLP 2024
The Earth is Flat because...: Investigating LLMs’ Belief towards Misinformation via Persuasive Conversation
ACL 2024
Preemptive Answer “Attacks” on Chain-of-Thought Reasoning
ACL 2024
Walking in Others’ Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias
EMNLP 2024
Knowledge Conflicts for LLMs: A Survey
EMNLP 2024