Papers
The Good, the Bad and the Constructive: Automatically Measuring Peer Review’s Utility for Authors
Abdelrahman Sadallah, Tim Baumgärtner, Iryna Gurevych et al.
The Good, the Bad, and the Debatable: A Survey on the Impacts of Data for In-Context Learning
Stephanie Schoch, Yangfeng Ji
The Green KNIGHT: Green Machine Translation with Knowledge-Distilled, Narrow, Inexpensive, Greedy, Hybrid Transformers
Andreas Guta, Frithjof Petrick, Peter Polák
The Hallucination Tax of Reinforcement Finetuning
Linxin Song, Taiwei Shi, Jieyu Zhao
The Illusion of Progress: Re-evaluating Hallucination Detection in LLMs
Denis Janiak, Jakub Binkowski, Albert Sawczyn et al.
The Illusion of Randomness: How LLMs Fail to Emulate Stochastic Decision-Making in Rock-Paper-Scissors Games?
Zihao Guo, Hongtao Lv, Chaoli Zhang et al.
The Impact of Language Mixing on Bilingual LLM Reasoning
Yihao Li, Jiayi Xin, Miranda Muqing Miao et al.
The Impact of Negated Text on Hallucination with Large Language Models
Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim
The iRead4Skills Intelligent Complexity Analyzer
Wafa Aissa, Raquel Amaro, David Antunes et al.
The Kyrgyz Seed Dataset Submission to the WMT25 Open Language Data Initiative Shared Task
Murat Jumashev, Alina Tillabaeva, Aida Kasieva et al.
The Language of Interoception: Examining Embodiment and Emotion Through a Corpus of Body Part Mentions
Sophie Wu, Jan Philip Wahle, Saif M. Mohammad
The LLM Already Knows: Estimating LLM-Perceived Question Difficulty via Hidden Representations
Yubo Zhu, Dongrui Liu, Zecheng Lin et al.
The Lookahead Limitation: Why Multi-Operand Addition is Hard for LLMs
Tanja Baeumel, Josef Van Genabith, Simon Ostermann
The Medium Is Not the Message: Deconfounding Document Embeddings via Linear Concept Erasure
Yu Fan, Yang Tian, Shauli Ravfogel et al.
The Missing Parts: Augmenting Fact Verification with Half Truth Detection
Yixuan Tang, Jincheng Wang, Anthony Kum Hoe Tung
The More, The Better? A Critical Study of Multimodal Context in Radiology Report Summarization
Mong Yuan Sim, Wei Emma Zhang, Xiang Dai et al.
Theorem-Validated Reverse Chain-of-Thought Problem Generation for Geometric Reasoning
Deng Linger, Linghao Zhu, Yuliang Liu et al.
The Power of Framing: How News Headlines Guide Search Behavior
Amrit Poudel, Maria Milkowski, Tim Weninger
The Practical Impacts of Theoretical Constructs on Empathy Modeling
Allison Lahnala, Charles Welch, David Jurgens et al.
The Price of Format: Diversity Collapse in LLMs
Longfei Yun, Chenyang An, Zilong Wang et al.
The Progress Illusion: Revisiting meta-evaluation standards of LLM evaluators
Tianruo Rose Xu, Vedant Gaur, Liu Leqi et al.
The Prompt Makes the Person(a): A Systematic Evaluation of Sociodemographic Persona Prompting for Large Language Models
Marlene Lutz, Indira Sen, Georg Ahnert et al.
The Psychology of Falsehood: A Human-Centric Survey of Misinformation Detection
Arghodeep Nandi, Megha Sundriyal, Euna Mehnaz Khan et al.
The Pursuit of Empathy: Evaluating Small Language Models for PTSD Dialogue Support
Suhas Bn, Yash Mahajan, Dominik O. Mattioli et al.