conftrace_

Manish Nagireddy

9 papers · 2024–2026 · 6 conferences · across top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (14) · 🐝 Cross-Pollinator (15) · 🌉 Interdisciplinary Bridge · 🌍 Conference Polyglot (6) · 🧭 Keyword Pioneer · 🐣 Hot Topic Early Bird · 👥 Mega-Team (22)

Conferences

ACL (2) · EMNLP (2) · NAACL (2) · AAAI (1) · ICLR (1) · IJCAI (1)

Top co-authors

Prasanna Sattigeri (6) · Inkit Padhi (5) · Pierre Dognin (4) · Kush R. Varshney (3) · Karthikeyan Natesan Ramamurthy (3) · Erik Miehling (3) · Ioana Baldini (2) · Ronny Luss (2) · Elizabeth M. Daly (2) · Werner Geyer (2)

Keywords

large language model (5) · value alignment (2) · dialogue system (2) · ai safety (1) · synthetic data generation (1) · language model (1) · generative ai (1) · hallucination detection (1) · harmful content detection (1) · multimodal model (1) · reasoning trace (1) · adversarial testing (1) · human-ai interaction (1) · faithful explanation (1) · attribution method (1) · explanation method (1) · social bias detection (1) · generative language model (1) · moral reasoning (1) · question answering benchmark (1)

Papers

Answering the Wrong Question: Reasoning Trace Inversion for Abstention in LLMs · ACL 2026
Multi-Level Explanations for Generative Language Models · ACL 2025
Programming Refusal with Conditional Activation Steering · ICLR 2025
Granite Guardian: Comprehensive LLM Safeguarding · NAACL 2025
DAMAGeR: Deploying Automatic and Manual Approaches to GenAI Red-teaming · NAACL 2025
ComVas: Contextual Moral Values Alignment System · IJCAI 2024
Value Alignment from Unstructured Text · EMNLP 2024
Language Models in Dialogue: Conversational Maxims for Human-AI Interactions · EMNLP 2024
SocialStigmaQA: A Benchmark to Uncover Stigma Amplification in Generative Language Models · AAAI 2024