Papers
Texts or Images? A Fine-grained Analysis on the Effectiveness of Input Representations and Models for Table Question Answering
Wei Zhou, Mohsen Mesgar, Heike Adel et al.
Text-to-ES Bench: A Comprehensive Benchmark for Converting Natural Language to Elasticsearch Query
Dongge Xue, Zhili Pu, Zhentao Xia et al.
Thapar Titan/s: Fine-Tuning Pretrained Language Models with Contextual Augmentation for Mistake Identification in Tutor–Student Dialogues
Harsh Dadwal, Sparsh Rastogi, Jatin Bedi
That doesn’t sound right: Evaluating speech transcription quality in field linguistics corpora
Eric Le Ferrand, Bo Jiang, Joshua Hartshorne et al.
That is Unacceptable: the Moral Foundations of Canceling
Soda Marem Lo, Oscar Araque, Rajesh Sharma et al.
The 2025 ReproNLP Shared Task on Reproducibility of Evaluations in NLP: Overview and Results
Anya Belz, Craig Thomson, Javier González Corbelle et al.
The 2nd Automated Verification of Textual Claims (AVeriTeC) Shared Task: Open-weights, Reproducible and Efficient Systems
Mubashara Akhtar, Rami Aly, Yulong Chen et al.
The AI Gap: How Socioeconomic Status Affects Language Technology Interactions
Elisa Bassignana, Amanda Cercas Curry, Dirk Hovy
The Alternative Annotator Test for LLM-as-a-Judge: How to Statistically Justify Replacing Human Annotators with LLMs
Nitay Calderon, Roi Reichart, Rotem Dror
The Anatomy of Evidence: An Investigation Into Explainable ICD Coding
Katharina Beckh, Elisa Studeny, Sujan Sai Gannamaneni et al.
The Art of Tool Interface Design
Yunnan Wu, Qile P. Chen, Deshank Baranwal et al.
The Behavior Gap: Evaluating Zero-shot LLM Agents in Complex Task-Oriented Dialogs
Avinash Baidya, Kamalika Das, Xiang Gao
The ClimateCheck Dataset: Mapping Social Media Claims About Climate Change to Corresponding Scholarly Articles
Raia Abu Ahmad, Aida Usmanova, Georg Rehm
The ClimateCheck Shared Task: Scientific Fact-Checking of Social Media Claims about Climate Change
Raia Abu Ahmad, Aida Usmanova, Georg Rehm
The Cross-linguistic Role of Animacy in Grammar Structures
Nina Gregorio, Matteo Gay, Sharon Goldwater et al.
The Distracting Effect: Understanding Irrelevant Passages in RAG
Chen Amiraz, Florin Cuconasu, Simone Filice et al.
The Effectiveness of Uncased Tokenization for Clinical Notes
Cory Paik, Katharina Von Der Wense
The Efficiency vs. Accuracy Trade-off: Optimizing RAG-Enhanced LLM Recommender Systems Using Multi-Head Early Exit
Huixue Zhou, Hengrui Gu, Zaifu Zhan et al.
The Elephant in the Room: Exploring the Role of Neutral Words in Language Model Group-Agnostic Debiasing
Xinwei Guo, Jiashi Gao, Junlei Zhou et al.
The Esethu Framework: Reimagining Sustainable Dataset Governance and Curation for Low-Resource Languages
Jenalea Rajab, Anuoluwapo Aremu, Everlyn Asiko Chimoto et al.
The Essence of Contextual Understanding in Theory of Mind: A Study on Question Answering with Story Characters
Chulun Zhou, Qiujing Wang, Mo Yu et al.
The Evolution of Gen Alpha Slang: Linguistic Patterns and AI Translation Challenges
Ishita, Radhika Mamidi
“The Facts Speak for Themselves”: GPT and Fallacy Classification
Erisa Bytyqi, Annette Hautli-Janisz
The Fellowship of the LLMs: Multi-Model Workflows for Synthetic Preference Optimization Dataset Generation
Samee Arif, Sualeha Farid, Abdul Hameed Azeemi et al.