Kavel Rao
5 papers · 2023–2025 · 3 conferences · across top CS/AI conferences
Achievements
Cross-Pollinator (15)
Conference Polyglot (3)
Renaissance Researcher (5)
Interdisciplinary Bridge
Taxonomy Completionist (18)
Keyword Pioneer
Hot Topic Early Bird
The Questioner
Conferences
AAAI (2)
NIPS (2)
EMNLP (1)
Keywords
large language model (5)
multi-task learning (2)
self-supervised learning (1)
contextual reasoning (1)
intent classification (1)
responsible AI (1)
AI safety (1)
human decision-making (1)
adversarial attack (1)
value alignment (1)
human value (1)
interpretable model (1)
multi-task model (1)
commonsense reasoning (1)
synthetic dataset (1)
risk classification (1)
jailbreak detection (1)
LLM safety (1)
safety training (1)
moral reasoning (1)
Papers
To Err Is AI: A Case Study Informing LLM Flaw Reporting Practices
AAAI 2025
WildGuard: Open One-stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
NIPS 2024
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
NIPS 2024
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties
AAAI 2024
What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations
EMNLP 2023