Verena Rieser

51 papers · 2005–2025 · 11 top CS/AI conferences

Achievements

🏃 Academic Marathon (20) 🌍 Conference Polyglot (11) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (9) 🌈 Renaissance Researcher (8) 👥 Mega-Team (42) 🤝 Dynamic Duo (15) 🔬 Deep Specialist (10) 🧬 Topic Evolution 🏆 Keyword Champion (2) 📈 Trend Setter 🗃️ Keyword Collector (174) ⚡ Prolific Year (11) 🔥 Unstoppable (6) 💎 Century Club (51) ❓ The Questioner (7)

Conferences

ACL (16) EMNLP (15) IJCNLP (4) COLING (3) EACL (3) NAACL (3) SEMEVAL (3) AACL (1) CONLL (1) ICLR (1) IJCAI (1)

Top co-authors

Gavin Abercrombie (15) Ioannis Konstas (14) Oliver Lemon (10) Ondřej Dušek (9) Amanda Cercas Curry (8) Tanvi Dinkar (6) Xinnuo Xu (5) Dirk Hovy (3) David M. Howcroft (3) Dimitra Gkatzia (3)

Keywords

conversational ai (7) human evaluation (5) text generation (4) natural language generation (4) conversational assistant (3) human-ai interaction (3) text classification (3) open-domain dialogue (3) multimodal learning (3) sentiment analysis (2) dialogue system (2) human annotation (2) abstractive summarization (2) commonsense reasoning (2) safety evaluation (2) conversational agent (2) language model (2) prompt engineering (2) natural language processing (2) crowdsourced annotation (2)

Papers

Consistency is Key: Disentangling Label Variation in Natural Language Processing with Intra-Annotator Agreement · EMNLP 2025
CulturalFrames: Assessing Cultural Expectation Alignment in Text-to-Image Models and Evaluation Metrics · EMNLP 2025
Value Profiles for Encoding Human Variation · EMNLP 2025
Century: A Framework and Dataset for Evaluating Historical Contextualisation of Sensitive Images · ICLR 2025
STAR: SocioTechnical Approach to Red Teaming Language Models · EMNLP 2024
ReproHum #0927-03: DExpert Evaluation? Reproducing Human Judgements of the Fluency of Generated Text · COLING 2024
The Dangers of trusting Stochastic Parrots: Faithfulness and Trust in Open-domain Conversational Question Answering · ACL 2023
Adversarial Textual Robustness on Visual Dialog · ACL 2023
iLab at SemEval-2023 Task 11 Le-Wi-Di: Modelling Disagreement or Modelling Perspectives? · ACL 2023
SemEval-2023 Task 11: Learning with Disagreements (LeWiDi) · ACL 2023
Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP · EACL 2023
Resources for Automated Identification of Online Gender-Based Violence: A Systematic Review · ACL 2023
iLab at SemEval-2023 Task 11 Le-Wi-Di: Modelling Disagreement or Modelling Perspectives? · SEMEVAL 2023
Quality-agnostic Image Captioning to Safely Assist People with Vision Impairment · IJCAI 2023
Mirages. On Anthropomorphism in Dialogue Systems · EMNLP 2023
Multitask Multimodal Prompted Training for Interactive Embodied Task Completion · EMNLP 2023
SemEval-2023 Task 11: Learning with Disagreements (LeWiDi) · SEMEVAL 2023
Risk-graded Safety for Handling Medical Queries in Conversational AI · AACL 2022
SafetyKit: First Aid for Measuring Safety in Open-domain Conversational Systems · ACL 2022
Risk-graded Safety for Handling Medical Queries in Conversational AI · IJCNLP 2022
Alexa, Google, Siri: What are Your Pronouns? Gender and Anthropomorphism in the Design and Perception of Conversational Assistants · IJCNLP 2021
ConvAbuse: Data, Analysis, and Benchmarks for Nuanced Abuse Detection in Conversational AI · EMNLP 2021
OTTers: One-turn Topic Transitions for Open-Domain Dialogue · IJCNLP 2021
AggGen: Ordering and Aggregating while Generating · IJCNLP 2021
AggGen: Ordering and Aggregating while Generating · ACL 2021
OTTers: One-turn Topic Transitions for Open-Domain Dialogue · ACL 2021
Alexa, Google, Siri: What are Your Pronouns? Gender and Anthropomorphism in the Design and Perception of Conversational Assistants · ACL 2021
MiRANews: Dataset and Benchmarks for Multi-Resource-Assisted News Summarization · EMNLP 2021
What happens if you treat ordinal ratings as interval data? Human evaluations in NLP are even more under-powered than you think · EMNLP 2021
Conversational Assistants and Gender Stereotypes: Public Perceptions and Desiderata for Voice Personas · COLING 2020
SLURP: A Spoken Language Understanding Resource Package · EMNLP 2020
History for Visual Dialog: Do we really need it? · ACL 2020
Fact-based Content Weighting for Evaluating Abstractive Summarisation · ACL 2020
RankME: Reliable Human Ratings for Natural Language Generation · NAACL 2018
Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity · EMNLP 2018
A Knowledge-Grounded Multimodal Search-Based Conversational Agent · EMNLP 2018
#MeToo Alexa: How Conversational Systems Respond to Sexual Harassment · NAACL 2018
Why We Need New Evaluation Metrics for NLG · EMNLP 2017
iLab-Edinburgh at SemEval-2016 Task 7: A Hybrid Approach for Determining Sentiment Intensity of Arabic Twitter Phrases · SEMEVAL 2016
Natural Language Generation enhances human decision-making with uncertain information · ACL 2016
From the Virtual to the RealWorld: Referring to Objects in Real-World Spatial Scenes · EMNLP 2015
Benchmarking Machine Translated Sentiment Analysis for Arabic Tweets · NAACL 2015
Cluster-based Prediction of User Ratings for Stylistic Surface Realisation · EACL 2014
Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems · CONLL 2012
Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems · EMNLP 2012
Optimising Information Presentation for Spoken Dialogue Systems · ACL 2010
Natural Language Generation as Planning Under Uncertainty for Spoken Dialogue Systems · EACL 2009
Learning Effective Multimodal Dialogue Strategies from Wizard-of-Oz Data: Bootstrapping and Evaluation · ACL 2008
Using Machine Learning to Explore Human Multimodal Clarification Strategies · ACL 2006
Using Machine Learning to Explore Human Multimodal Clarification Strategies · COLING 2006
Implications for Generating Clarification Requests in Task-Oriented Dialogues · ACL 2005