Verena Rieser
51 papers · 2005–2025 · 11 top CS/AI conferences
Achievements
Academic Marathon (20)
Conference Polyglot (11)
Interdisciplinary Bridge
Keyword Pioneer
Cross-Pollinator (9)
Renaissance Researcher (8)
Mega-Team (42)
Dynamic Duo (15)
Deep Specialist (10)
Topic Evolution
Keyword Champion (2)
Trend Setter
Keyword Collector (174)
Prolific Year (11)
Unstoppable (6)
Century Club (51)
The Questioner (7)
Conferences
ACL (16)
EMNLP (15)
IJCNLP (4)
COLING (3)
EACL (3)
NAACL (3)
SEMEVAL (3)
AACL (1)
CONLL (1)
ICLR (1)
IJCAI (1)
Keywords
conversational ai (7)
human evaluation (5)
text generation (4)
natural language generation (4)
conversational assistant (3)
human-ai interaction (3)
text classification (3)
open-domain dialogue (3)
multimodal learning (3)
sentiment analysis (2)
dialogue system (2)
human annotation (2)
abstractive summarization (2)
commonsense reasoning (2)
safety evaluation (2)
conversational agent (2)
language model (2)
prompt engineering (2)
natural language processing (2)
crowdsourced annotation (2)
Papers
Consistency is Key: Disentangling Label Variation in Natural Language Processing with Intra-Annotator Agreement
EMNLP 2025
CulturalFrames: Assessing Cultural Expectation Alignment in Text-to-Image Models and Evaluation Metrics
EMNLP 2025
Value Profiles for Encoding Human Variation
EMNLP 2025
Century: A Framework and Dataset for Evaluating Historical Contextualisation of Sensitive Images
ICLR 2025
STAR: SocioTechnical Approach to Red Teaming Language Models
EMNLP 2024
ReproHum #0927-03: DExpert Evaluation? Reproducing Human Judgements of the Fluency of Generated Text
COLING 2024
The Dangers of trusting Stochastic Parrots: Faithfulness and Trust in Open-domain Conversational Question Answering
ACL 2023
Adversarial Textual Robustness on Visual Dialog
ACL 2023
iLab at SemEval-2023 Task 11 Le-Wi-Di: Modelling Disagreement or Modelling Perspectives?
ACL 2023
SemEval-2023 Task 11: Learning with Disagreements (LeWiDi)
ACL 2023
Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP
EACL 2023
Resources for Automated Identification of Online Gender-Based Violence: A Systematic Review
ACL 2023
iLab at SemEval-2023 Task 11 Le-Wi-Di: Modelling Disagreement or Modelling Perspectives?
SEMEVAL 2023
Quality-agnostic Image Captioning to Safely Assist People with Vision Impairment
IJCAI 2023
Mirages. On Anthropomorphism in Dialogue Systems
EMNLP 2023
Multitask Multimodal Prompted Training for Interactive Embodied Task Completion
EMNLP 2023
SemEval-2023 Task 11: Learning with Disagreements (LeWiDi)
SEMEVAL 2023
Risk-graded Safety for Handling Medical Queries in Conversational AI
AACL 2022
SafetyKit: First Aid for Measuring Safety in Open-domain Conversational Systems
ACL 2022
Risk-graded Safety for Handling Medical Queries in Conversational AI
IJCNLP 2022
Alexa, Google, Siri: What are Your Pronouns? Gender and Anthropomorphism in the Design and Perception of Conversational Assistants
IJCNLP 2021
ConvAbuse: Data, Analysis, and Benchmarks for Nuanced Abuse Detection in Conversational AI
EMNLP 2021
OTTers: One-turn Topic Transitions for Open-Domain Dialogue
IJCNLP 2021
AggGen: Ordering and Aggregating while Generating
IJCNLP 2021
AggGen: Ordering and Aggregating while Generating
ACL 2021
OTTers: One-turn Topic Transitions for Open-Domain Dialogue
ACL 2021
Alexa, Google, Siri: What are Your Pronouns? Gender and Anthropomorphism in the Design and Perception of Conversational Assistants
ACL 2021
MiRANews: Dataset and Benchmarks for Multi-Resource-Assisted News Summarization
EMNLP 2021
What happens if you treat ordinal ratings as interval data? Human evaluations in NLP are even more under-powered than you think
EMNLP 2021
Conversational Assistants and Gender Stereotypes: Public Perceptions and Desiderata for Voice Personas
COLING 2020
SLURP: A Spoken Language Understanding Resource Package
EMNLP 2020
History for Visual Dialog: Do we really need it?
ACL 2020
Fact-based Content Weighting for Evaluating Abstractive Summarisation
ACL 2020
RankME: Reliable Human Ratings for Natural Language Generation
NAACL 2018
Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity
EMNLP 2018
A Knowledge-Grounded Multimodal Search-Based Conversational Agent
EMNLP 2018
#MeToo Alexa: How Conversational Systems Respond to Sexual Harassment
NAACL 2018
Why We Need New Evaluation Metrics for NLG
EMNLP 2017
iLab-Edinburgh at SemEval-2016 Task 7: A Hybrid Approach for Determining Sentiment Intensity of Arabic Twitter Phrases
SEMEVAL 2016
Natural Language Generation enhances human decision-making with uncertain information
ACL 2016
From the Virtual to the Real World: Referring to Objects in Real-World Spatial Scenes
EMNLP 2015
Benchmarking Machine Translated Sentiment Analysis for Arabic Tweets
NAACL 2015
Cluster-based Prediction of User Ratings for Stylistic Surface Realisation
EACL 2014
Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems
CONLL 2012
Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems
EMNLP 2012
Optimising Information Presentation for Spoken Dialogue Systems
ACL 2010
Natural Language Generation as Planning Under Uncertainty for Spoken Dialogue Systems
EACL 2009
Learning Effective Multimodal Dialogue Strategies from Wizard-of-Oz Data: Bootstrapping and Evaluation
ACL 2008
Using Machine Learning to Explore Human Multimodal Clarification Strategies
ACL 2006
Using Machine Learning to Explore Human Multimodal Clarification Strategies
COLING 2006
Implications for Generating Clarification Requests in Task-Oriented Dialogues
ACL 2005