Su Lin Blodgett
25 papers · 2016–2026 · 6 conferences · across top CS/AI conferences
Achievements
Academic Marathon (9) · Conference Polyglot (6) · Interdisciplinary Bridge · Keyword Pioneer · Cross-Pollinator (6)
Renaissance Researcher (6)
Conference Polyglot (6)
Academic Marathon (9)
Mega-Team (20)
Dynamic Duo (10)
Deep Specialist (11)
Topic Evolution
Keyword Champion (4)
Keyword Collector (107)
Prolific Year (6)
Unstoppable (6)
Century Club (24)
The Questioner
Conferences
ACL (12)
EMNLP (5)
NAACL (4)
IJCNLP (2)
AAAI (1)
ICML (1)
Keywords
natural language processing (5)
nlp evaluation (4)
measurement modeling (4)
stereotyping detection (3)
responsible ai (3)
bias evaluation (3)
bias detection (3)
human-centered evaluation (2)
dependency parsing (2)
measurement model (2)
text generation (2)
natural language generation (2)
measurement validity (2)
racial bias (2)
coreference resolution (2)
representational harm (2)
fairness benchmark (2)
text summarization (1)
semantic analysis (1)
data augmentation (1)
Papers
Illusions of the Gold Standard: A Large-scale Analysis of Human Evaluation Protocols for Long-form Text Generation
ACL 2026
Understanding and Meeting Practitioner Needs When Measuring Representational Harms Caused by LLM-Based Systems
ACL 2025
Dehumanizing Machines: Mitigating Anthropomorphic Behaviors in Text Generation Systems
ACL 2025
Position: Evaluating Generative AI Systems Is a Social Science Measurement Challenge
ICML 2025
Understanding the Impacts of Language Technologies’ Performance Disparities on African American Language Speakers
ACL 2024
Human-Centered Evaluation of Language Technologies
EMNLP 2024
The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels
NAACL 2024
“One-Size-Fits-All”? Examining Expectations around What Constitute “Fair” or “Good” NLG System Behaviors
NAACL 2024
ECBD: Evidence-Centered Benchmark Design for NLP
ACL 2024
Metrics for What, Metrics for Whom: Assessing Actionability of Bias Evaluation Metrics in NLP
EMNLP 2024
Taxonomizing and Measuring Representational Harms: A Look at Image Tagging
AAAI 2023
FairPrism: Evaluating Fairness-Related Harms in Text Generation
ACL 2023
This prompt is measuring <mask>: evaluating bias evaluation in language models
ACL 2023
It Takes Two to Tango: Navigating Conceptualizations of NLP Tasks and Measurements of Performance
ACL 2023
Responsible AI Considerations in Text Summarization Research: A Review of Current Practices
EMNLP 2023
Deconstructing NLG Evaluation: Evaluation Practices, Assumptions, and Their Implications
NAACL 2022
Examining Political Rhetoric with Epistemic Stance Detection
EMNLP 2022
Stereotyping Norwegian Salmon: An Inventory of Pitfalls in Fairness Benchmark Datasets
ACL 2021
A Survey of Race, Racism, and Anti-Racism in NLP
IJCNLP 2021
Stereotyping Norwegian Salmon: An Inventory of Pitfalls in Fairness Benchmark Datasets
IJCNLP 2021
A Survey of Race, Racism, and Anti-Racism in NLP
ACL 2021
Language (Technology) is Power: A Critical Survey of “Bias” in NLP
ACL 2020
Twitter Universal Dependency Parsing for African-American and Mainstream American English
ACL 2018
Monte Carlo Syntax Marginals for Exploring and Using Dependency Parses
NAACL 2018
Demographic Dialectal Variation in Social Media: A Case Study of African-American English
EMNLP 2016