Douwe Kiela

101 papers · 2013–2025 · 11 conferences · across top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🏃 Academic Marathon (12) 🌍 Conference Polyglot (11) 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🌟 Keyword Trendsetter Combo (4) 🏠 Conference Loyalist (21) 🏆 Grand Slam 👥 Mega-Team (47) 🌱 Topic Pioneer 👑 Triple Crown 🏆 Keyword Champion 🤝 Dynamic Duo (16) ⚡ Prolific Year (26) ❓ The Questioner (4) 💎 Century Club (101) 📈 Trend Setter 🔥 Unstoppable (13) 🗃️ Keyword Collector (340) 🚀 Conference Pioneer

Conferences

EMNLP (29) ACL (21) IJCNLP (12) NIPS (10) ICLR (8) NAACL (6) EACL (5) ICML (4) AAAI (3) CVPR (2) COLING (1)

Top co-authors

Jason Weston (16) Kyunghyun Cho (14) Tristan Thrush (13) Adina Williams (12) Stephen Clark (12) Amanpreet Singh (9) Emily Dinan (7) Robin Jia (7) Siddharth Karamcheti (6) Bertie Vidgen (6)

Research topics

Natural Language Processing (1)

Keywords

natural language inference (8) question answering (7) natural language processing (6) model evaluation (6) multimodal learning (6) adversarial learning (5) multi-agent system (5) data augmentation (5) model robustness (5) language model (5) adversarial training (4) representation learning (4) hate speech detection (4) dialogue system (4) text classification (4) benchmark dataset (4) visual grounding (3) semantic similarity (3) benchmark evaluation (3) transformer architecture (3)

Papers

LMUNIT: Fine-grained Evaluation with Natural Language Unit Tests · EMNLP 2025
Great Models Think Alike and this Undermines AI Oversight · ICML 2025
OLMoE: Open Mixture-of-Experts Language Models · ICLR 2025
Generative Representational Instruction Tuning · ICLR 2025
I am a Strange Dataset: Metalinguistic Tests for Language Models · ACL 2024
Anchor Points: Benchmarking Models with Much Fewer Examples · EACL 2024
Nearest Neighbor Normalization Improves Multimodal Retrieval · EMNLP 2024
Model Alignment as Prospect Theoretic Optimization · ICML 2024
Leveraging Diffusion Perturbations for Measuring Fairness in Computer Vision · AAAI 2024
DataPerf: Benchmarks for Data-Centric AI Development · NIPS 2023
Investigating Multi-source Active Learning for Natural Language Inference · EACL 2023
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents · NIPS 2023
Perturbation Augmentation for Fairer NLP · EMNLP 2022
Analyzing Dynamic Adversarial Training Data in the Limit · ACL 2022
Models in the Loop: Aiding Crowdworkers with Generative Annotation Assistants · NAACL 2022
Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks · ACL 2022
FLAVA: A Foundational Language and Vision Alignment Model · CVPR 2022
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality · CVPR 2022
I like fish, especially dolphins: Addressing Contradictions in Dialogue Modeling · ACL 2021
Cross-Modal Retrieval Augmentation for Multi-Modal Classification · EMNLP 2021
Reservoir Transformers · ACL 2021
On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study · ACL 2021
Findings of the WOAH 5 Shared Task on Fine Grained Hateful Memes Detection · ACL 2021
Findings of the WMT 2021 Shared Task on Large-Scale Multilingual Machine Translation · EMNLP 2021
To what extent do human explanations of model behavior align with actual model behavior? · EMNLP 2021
Retrieval Augmentation Reduces Hallucination in Conversation · EMNLP 2021
DynaSent: A Dynamic Benchmark for Sentiment Analysis · ACL 2021
Improving Question Answering Model Robustness with Synthetic Adversarial Data Generation · EMNLP 2021
Gradient-based Adversarial Attacks against Text Transformers · EMNLP 2021
What’s Hidden in a One-layer Randomly Weighted Transformer? · EMNLP 2021
Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little · EMNLP 2021
Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking · NIPS 2021
True Few-Shot Learning with Language Models · NIPS 2021
Human-Adversarial Visual Question Answering · NIPS 2021
Dynabench: Rethinking Benchmarking in NLP · NAACL 2021
Findings of the WOAH 5 Shared Task on Fine Grained Hateful Memes Detection · IJCNLP 2021
On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study · IJCNLP 2021
Reservoir Transformers · IJCNLP 2021
DynaSent: A Dynamic Benchmark for Sentiment Analysis · IJCNLP 2021
I like fish, especially dolphins: Addressing Contradictions in Dialogue Modeling · IJCNLP 2021
Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection · IJCNLP 2021
Rissanen Data Analysis: Examining Dataset Characteristics via Description Length · ICML 2021
Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval · ICLR 2021
Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection · ACL 2021
Adversarial NLI: A New Benchmark for Natural Language Understanding · ACL 2020
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes · NIPS 2020
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks · NIPS 2020
Learning Optimal Representations with the Decodable Information Bottleneck · NIPS 2020
Generating Interactive Worlds with Text · AAAI 2020
Multi-Dimensional Gender Bias Classification · EMNLP 2020
Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation · EMNLP 2020
Unsupervised Question Decomposition for Question Answering · EMNLP 2020
On the interaction between supervision and self-play in emergent communication · ICLR 2020
Finding Generalizable Evidence by Learning to Convince Q&A Models · EMNLP 2019
Emergent Linguistic Phenomena in Multi-Agent Communication Games · EMNLP 2019
Countering Language Drift via Visual Grounding · EMNLP 2019
Seeded self-play for language learning · EMNLP 2019
What makes a good conversation? How controllable attributes affect human judgments · NAACL 2019
Analysis of Joint Multilingual Sentence Representations and Semantic K-Nearest Neighbor Graphs · AAAI 2019
No Training Required: Exploring Random Encoders for Sentence Classification · ICLR 2019
Hyperbolic Graph Neural Networks · NIPS 2019
Inferring Concept Hierarchies from Text Corpora via Hyperbolic Embeddings · ACL 2019
Learning to Speak and Act in a Fantasy Text Adventure Game · IJCNLP 2019
Finding Generalizable Evidence by Learning to Convince Q&A Models · IJCNLP 2019
Emergent Linguistic Phenomena in Multi-Agent Communication Games · IJCNLP 2019
Countering Language Drift via Visual Grounding · IJCNLP 2019
Learning to Speak and Act in a Fantasy Text Adventure Game · EMNLP 2019
Learning Visually Grounded Sentence Representations · NAACL 2018
Hearst Patterns Revisited: Automatic Hypernym Detection from Large Text Corpora · ACL 2018
Learning Continuous Hierarchies in the Lorentz Model of Hyperbolic Geometry · ICML 2018
Personalizing Dialogue Agents: I have a dog, do you have pets too? · ACL 2018
Code-Switched Named Entity Recognition with Embedding Attention · ACL 2018
Emergent Communication in a Multi-Modal, Multi-Step Referential Game · ICLR 2018
Emergent Translation in Multi-Agent Communication · ICLR 2018
Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent · ICLR 2018
Dynamic Meta-Embeddings for Improved Sentence Representations · EMNLP 2018
Jump to better conclusions: SCAN both left and right · EMNLP 2018
Automatically Generating Rhythmic Verse with Neural Networks · ACL 2017
Learning to Negate Adjectives with Bilinear Models · EACL 2017
Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection · EMNLP 2017
Evaluation by Association: A Systematic Study of Quantitative Word Association Evaluation · EACL 2017
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data · EMNLP 2017
Poincaré Embeddings for Learning Hierarchical Representations · NIPS 2017
Multimodal Learning and Reasoning · ACL 2016
Multi-Modal Representations for Improved Bilingual Lexicon Learning · ACL 2016
Robust Text Classification for Sparsely Labelled Data Using Multi-level Embeddings · COLING 2016
Black Holes and White Rabbits: Metaphor Identification with Visual Features · NAACL 2016
Comparing Data Sources and Architectures for Deep Visual Representation Learning in Semantics · EMNLP 2016
Vision and Feature Norms: Improving automatic feature norm learning through cross-modal maps · NAACL 2016
MMFeat: A Toolkit for Extracting Multi-Modal Features · ACL 2016
Grounding Semantics in Olfactory Perception · IJCNLP 2015
Exploiting Image Generality for Lexical Entailment Detection · ACL 2015
Exploiting Image Generality for Lexical Entailment Detection · IJCNLP 2015
Grounding Semantics in Olfactory Perception · ACL 2015
Multi- and Cross-Modal Semantics Beyond Vision: Grounding in Auditory Perception · EMNLP 2015
Specializing Word Embeddings for Similarity or Relatedness · EMNLP 2015
Visual Bilingual Lexicon Induction with Transferred ConvNet Features · EMNLP 2015
Learning Image Embeddings using Convolutional Neural Networks for Improved Multi-Modal Semantics · EMNLP 2014
Improving Multi-Modal Representations Using Image Dispersion: Why Less is Sometimes More · ACL 2014
Proceedings of the Student Research Workshop at the 14th Conference of the European Chapter of the Association for Computational Linguistics · EACL 2014
Detecting Compositionality of Multi-Word Expressions using Nearest Neighbours in Vector Space Models · EMNLP 2013