Nouha Dziri

32 papers · 2018–2026 · 8 conferences · across top CS/AI conferences

Achievements

🌈 Renaissance Researcher (6) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (8) 🏃 Academic Marathon (7) 🗺️ Taxonomy Completionist (52)

🐝 Cross-Pollinator (15) 🤝 Dynamic Duo (16) 👑 Triple Crown 🏆 Grand Slam 🧬 Topic Evolution 🔥 Unstoppable (5) 💎 Century Club (30) 🗃️ Keyword Collector (103) ⚡ Prolific Year (7) ❓ The Questioner (2)

Conferences

ICLR (6) NAACL (6) NIPS (6) ACL (5) EMNLP (3) ICML (3) AAAI (2) ICCV (1)

Top co-authors

Yejin Choi (16) Liwei Jiang (12) Ximing Lu (9) Valentina Pyatkin (7) Khyathi Chandu (7) Osmar Zaïane (6) Allyson Ettinger (6) Abhilasha Ravichander (5) Bill Yuchen Lin (5) Kavel Rao (5)

Keywords

large language model (9) language model (4) reinforcement learning from human feedback (3) reward model (2) self-supervised learning (2) multi-task learning (2) safety training (2) hallucination detection (2) conversational model (2) dialogue generation (2) dialogue system (2) benchmark evaluation (2) text generation (2) natural language inference (1) factual accuracy (1) knowledge editing (1) representation learning (1) direct preference optimization (1) preference optimization (1) model calibration (1)

Papers

Knowledge Control for Responsible Generative AI: Bridging Academia, Industry, and Society ACL 2026
Current Advances in LLM Reasoning ACL 2026
Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction ICLR 2025
AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text ICLR 2025
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild ICLR 2025
To Err Is AI: A Case Study Informing LLM Flaw Reporting Practices AAAI 2025
REL-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance NAACL 2025
RewardBench: Evaluating Reward Models for Language Modeling NAACL 2025
SafetyAnalyst: Interpretable, Transparent, and Steerable Safety Moderation for AI Behavior ICML 2025
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning ICLR 2024
WildGuard: Open One-stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs NIPS 2024
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models NIPS 2024
The Art of Saying No: Contextual Noncompliance in Language Models NIPS 2024
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties AAAI 2024
Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement ICLR 2024
The Generative AI Paradox: “What It Can Create, It May Not Understand” ICLR 2024
Position: A Roadmap to Pluralistic Alignment ICML 2024
Elastic Weight Removal for Faithful and Abstractive Dialogue Generation NAACL 2024
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training NIPS 2023
What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations EMNLP 2023
Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning EMNLP 2023
CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos ICCV 2023
Self-Refine: Iterative Refinement with Self-Feedback NIPS 2023
Faith and Fate: Limits of Transformers on Compositionality NIPS 2023
Evaluating Open-Domain Question Answering in the Era of Large Language Models ACL 2023
On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models? NAACL 2022
Neural Path Hunter: Reducing Hallucination in Dialogue Systems via Path Grounding EMNLP 2021
Decomposed Mutual Information Estimation for Contrastive Representation Learning ICML 2021
Augmenting Neural Response Generation with Context-Aware Topical Attention ACL 2019
Evaluating Coherence in Dialogue Systems using Entailment NAACL 2019
Evaluating Coherence in Dialogue Systems using Entailment ACL 2019
Automatic Dialogue Generation with Expressed Emotions NAACL 2018