Desmond Elliott
56 papers · 2013–2026 · 13 conferences across top CS/AI venues
Achievements
Academic Marathon (13)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (12)
Cross-Pollinator (3)
Renaissance Researcher (8)
Keyword Trendsetter Combo (3)
Deep Specialist (21)
Keyword Champion (3)
Mega-Team (20)
Century Club (55)
Keyword Collector (210)
Prolific Year (9)
The Questioner (5)
Unstoppable (14)
Trend Setter
Conferences
EMNLP (19)
ACL (13)
EACL (5)
IJCNLP (5)
NAACL (4)
COLING (2)
CoNLL (2)
AACL (1)
CVPR (1)
ICLR (1)
ICML (1)
IJCAI (1)
WACV (1)
Research topics
Keywords
multimodal learning (11)
image captioning (8)
vision-language model (8)
visual grounding (5)
diagnostic classifier (5)
transfer learning (4)
multimodal machine translation (4)
visual question answering (3)
text classification (3)
large language model (3)
cultural bias (3)
zero-shot learning (3)
retrieval-augmented generation (3)
retrieval augmentation (3)
cross-lingual transfer (3)
vision language model (3)
visual context (3)
language model (3)
representation learning (3)
cross-modal representation (3)
Papers
ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large Language Models
WACV 2026
Can Community Notes Replace Professional Fact-Checkers?
ACL 2025
How Do Multilingual Language Models Remember Facts?
ACL 2025
Seeing What Tastes Good: Revisiting Multimodal Distributional Semantics in the Billion Parameter Era
ACL 2025
Uncovering Cultural Representation Disparities in Vision-Language Models
AACL 2025
Uncovering Cultural Representation Disparities in Vision-Language Models
IJCNLP 2025
Multilingual Pretraining for Pixel Language Models
EMNLP 2025
LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks
ACL 2025
Tracking Universal Features Through Fine-Tuning and Model Merging
NAACL 2025
The Role of Data Curation in Image Captioning
EACL 2024
PAELLA: Parameter-Efficient Lightweight Language-Agnostic Captioning Model
NAACL 2024
Sequential Compositional Generalization in Multimodal Models
NAACL 2024
Understanding Retrieval Robustness for Retrieval-augmented Image Captioning
ACL 2024
FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture
EMNLP 2024
LMCap: Few-shot Multilingual Image Captioning by Retrieval Augmented Language Model Prompting
ACL 2023
PHD: Pixel-Based Language Modeling of Historical Documents
EMNLP 2023
Visual Prediction Improves Zero-Shot Cross-Modal Machine Translation
EMNLP 2023
SmallCap: Lightweight Image Captioning Prompted With Retrieval Augmentation
CVPR 2023
Text Rendering Strategies for Pixel Language Models
EMNLP 2023
Evaluating Bias and Fairness in Gender-Neutral Pretrained Vision-and-Language Models
EMNLP 2023
Retrieval-augmented Image Captioning
EACL 2023
MultiFin: A Dataset for Multilingual Financial NLP
EACL 2023
Language Modelling with Pixels
ICLR 2023
Revisiting Transformer-based Models for Long Document Classification
EMNLP 2022
Multilingual Multimodal Learning with Machine Translated Text
EMNLP 2022
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages
ICML 2022
The Role of Syntactic Planning in Compositional Image Captioning
EACL 2021
Probing Cross-Modal Representations in Multi-Step Relational Reasoning
ACL 2021
Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers
EMNLP 2021
Visually Grounded Reasoning across Languages and Cultures
EMNLP 2021
mDAPT: Multilingual Domain Adaptive Pretraining in a Single Model
EMNLP 2021
Probing Cross-Modal Representations in Multi-Step Relational Reasoning
IJCNLP 2021
The Sensitivity of Language Models and Humans to Winograd Schema Perturbations
ACL 2020
Textual Supervision for Visually Grounded Spoken Language Understanding
EMNLP 2020
Multimodal Speech Recognition with Unstructured Audio Masking
EMNLP 2020
Fine-Grained Grounding for Multimodal Speech Recognition
EMNLP 2020
CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning
ACL 2020
On Forgetting to Cite Older Papers: An Analysis of the ACL Anthology
ACL 2020
Cross-lingual Visual Verb Sense Disambiguation
NAACL 2019
Adversarial Removal of Demographic Attributes Revisited
IJCNLP 2019
Adversarial Removal of Demographic Attributes Revisited
EMNLP 2019
Compositional Generalization in Image Captioning
CoNLL 2019
Understanding the Effect of Textual Adversaries in Multimodal Machine Translation
EMNLP 2019
Findings of the Third Shared Task on Multimodal Machine Translation
EMNLP 2018
Lessons Learned in Multilingual Grounded Language Learning
CoNLL 2018
Measuring the Diversity of Automatic Image Descriptions
COLING 2018
Adversarial Evaluation of Multimodal Machine Translation
EMNLP 2018
Imagination Improves Multimodal Translation
IJCNLP 2017
Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures (Extended Abstract)
IJCAI 2017
Multimodal Learning and Reasoning
ACL 2016
Describing Images using Inferred Visual Dependency Representations
IJCNLP 2015
Describing Images using Inferred Visual Dependency Representations
ACL 2015
Comparing Automatic Evaluation Measures for Image Description
ACL 2014
Proceedings of the Student Research Workshop at the 14th Conference of the European Chapter of the Association for Computational Linguistics
EACL 2014
Query-by-Example Image Retrieval using Visual Dependency Representations
COLING 2014
Image Description using Visual Dependency Representations
EMNLP 2013