conftrace_

Kavel Rao

5 papers · 2023–2025 · 3 conferences · across top CS/AI conferences

Achievements

🐝 Cross-Pollinator (15) 🌍 Conference Polyglot (3) 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (18)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird ❓ The Questioner

Conferences

AAAI (2) NIPS (2) EMNLP (1)

Top co-authors

Liwei Jiang (5) Nouha Dziri (5) Yejin Choi (4) Allyson Ettinger (3) Maarten Sap (2) Valentina Pyatkin (2) Seungju Han (2) Ximing Lu (2) Faeze Brahman (2) Taylor Sorensen (1)

Keywords

large language model (5) multi-task learning (2) self-supervised learning (1) contextual reasoning (1) intent classification (1) responsible ai (1) ai safety (1) human decision-making (1) adversarial attack (1) value alignment (1) human value (1) interpretable model (1) multi-task model (1) commonsense reasoning (1) synthetic dataset (1) risk classification (1) jailbreak detection (1) llm safety (1) safety training (1) moral reasoning (1)

Papers

To Err Is AI: A Case Study Informing LLM Flaw Reporting Practices · AAAI 2025

WildGuard: Open One-stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs · NIPS 2024

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models · NIPS 2024

Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties · AAAI 2024

What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations · EMNLP 2023