Saad Mahamood
11 papers · 2021–2026 · 6 conferences · across top CS/AI conferences
Achievements
🗺️ Taxonomy Completionist (16)
🌍 Conference Polyglot (6)
🏃 Academic Marathon (5)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
👥 Mega-Team (77)
💎 Century Club (10)
❓ The Questioner
🔥 Unstoppable (5)
Conferences
EACL (3)
ACL (2)
EMNLP (2)
NAACL (2)
COLING (1)
IJCNLP (1)
Keywords
human evaluation (6)
natural language generation (5)
evaluation metric (3)
summarization evaluation (2)
inter-annotator agreement (2)
automated metric (2)
annotation quality (2)
evaluation methodology (2)
large language model (2)
text summarization (2)
user interface (1)
reproducibility study (1)
end-to-end approach (1)
user experience (1)
experimental methodology (1)
nlp research (1)
referring expression generation (1)
pyramid evaluation (1)
span annotation (1)
annotator quality (1)
Papers
LLMs as Span Annotators: A Comparative Study of LLMs and Humans
EACL 2026
Lessons from a User Experience Evaluation of NLP Interfaces
NAACL 2025
Real-World Summarization: When Evaluation Reaches Its Limits
EMNLP 2025
ReproHum #0124-03: Reproducing Human Evaluations of end-to-end approaches for Referring Expression Generation
COLING 2024
On the Role of Summary Content Units in Text Summarization Evaluation
NAACL 2024
A Needle in a Haystack: An Analysis of High-Agreement Workers on MTurk for Summarization
ACL 2023
Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP
EACL 2023
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
EMNLP 2022
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
IJCNLP 2021
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
ACL 2021
It’s Commonsense, isn’t it? Demystifying Human Evaluations in Commonsense-Enhanced NLG Systems
EACL 2021