Ninareh Mehrabi

19 papers · 2021–2026 · 5 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (42) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🌍 Conference Polyglot (5) 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🤝 Dynamic Duo (16) ❓ The Questioner ⚡ Prolific Year (9) 💎 Century Club (18) 📈 Trend Setter 🔥 Unstoppable (5) 🗃️ Keyword Collector (82)

Conferences

NAACL (7) EMNLP (6) ACL (4) AAAI (1) EACL (1)

Top co-authors

Aram Galstyan (17) Rahul Gupta (13) Kai-Wei Chang (12) Palash Goyal (8) Richard Zemel (6) Charith Peris (5) Anil Ramakrishna (5) Fred Morstatter (4) Jwala Dhamala (4) Greg Ver Steeg (2)

Keywords

large language model (8) bias mitigation (3) text generation (3) in-context learning (2) toxicity detection (2) adversarial attack (2) bias detection (1) data augmentation (1) data poisoning (1) chain-of-thought reasoning (1) commonsense knowledge (1) attention mechanism (1) question answering (1) knowledge graph question answering (1) text-to-image generation (1) ambiguity resolution (1) intent classification (1) responsible ai (1) abstract meaning representation (1) kl divergence (1)

Papers

SWAN: Semantic Watermarking with Abstract Meaning Representation · ACL 2026
Diagnosing Memorization in Chain-of-Thought Reasoning, One Token at a Time · EMNLP 2025
Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation · ACL 2025
DiCoRe: Enhancing Zero-shot Event Detection via Divergent-Convergent LLM Reasoning · EMNLP 2025
On Localizing and Deleting Toxic Memories in Large Language Models · NAACL 2025
Tokenization Matters: Navigating Data-Scarce Tokenization for Gender Inclusive Language Technologies · NAACL 2024
Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs · ACL 2024
Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models · EMNLP 2024
Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification · EMNLP 2024
Prompt Perturbation Consistency Learning for Robust Language Models · EACL 2024
MICo: Preventative Detoxification of Large Language Models through Inhibition Control · NAACL 2024
FLIRT: Feedback Loop In-context Red Teaming · EMNLP 2024
BELIEVE: Belief-Enhanced Instruction Generation and Augmentation for Zero-Shot Bias Mitigation · NAACL 2024
The steerability of large language models toward data-driven personas · NAACL 2024
Resolving Ambiguities in Text-to-Image Generative Models · ACL 2023
Attributing Fair Decisions with Attention Interventions · NAACL 2022
Robust Conversational Agents against Imperceptible Toxicity Triggers · NAACL 2022
Lawyers are Dishonest? Quantifying Representational Harms in Commonsense Knowledge Resources · EMNLP 2021
Exacerbating Algorithmic Bias through Fairness Attacks · AAAI 2021