When LLMs Read Tables Carelessly: Measuring and Reducing Data Referencing Errors

Yuqing Yang; Qi Zhu; Zhen Han; Boran Han; Zhengyuan Shen; Shuai Wang; Vassilis N. Ioannidis; Huzefa Rangwala

ACL 2026

Abstract

While large language models (LLMs) perform well on table tasks, they still make data referencing errors (DREs), i.e., they cite table values incorrectly or omit them, even when they understand the table structure. Beyond affecting final-answer accuracy, DREs directly compromise the correctness and reliability of intermediate reasoning steps. Yet prior studies have offered only limited, small-scale analyses. In this work, we present the first systematic evaluation of tabular data referencing errors across different models and tasks. Our results show that DREs occur in all tested models (1.7B to 20B parameters). Furthermore, we demonstrate that using data referencing as a critic signal, through critic-based filtering and rejection sampling, significantly improves answer accuracy, by up to 12.0%. Finally, we train a lightweight 4B-parameter critic model that achieves an average F1 score of 78.2% in detecting both in-distribution and out-of-distribution DREs and effectively assists inference for larger models.
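
To make the idea of critic-guided rejection sampling concrete, the following is a minimal sketch and not the paper's implementation. It assumes a hypothetical citation format `[row|column=value]` in model responses, and a toy rule-based critic stands in for the trained critic model. The names `has_referencing_error`, `rejection_sample`, and `generate` are illustrative. The toy critic catches only incorrectly cited values, not omitted ones.

```python
import re
from typing import Callable, Dict, Optional

# Assumed table layout: row label -> column name -> cell value (all strings).
Table = Dict[str, Dict[str, str]]

# Hypothetical citation format: a response cites a cell as [row|column=value].
CITATION = re.compile(r"\[([^\]|]+)\|([^\]=]+)=([^\]]+)\]")


def has_referencing_error(table: Table, response: str) -> bool:
    """Toy critic: True if any cited value disagrees with the table."""
    for row, column, value in CITATION.findall(response):
        actual = table.get(row.strip(), {}).get(column.strip())
        if actual != value.strip():
            return True
    return False


def rejection_sample(
    table: Table,
    generate: Callable[[int], str],
    max_tries: int = 4,
) -> Optional[str]:
    """Return the first sampled response the critic accepts, else None."""
    for attempt in range(max_tries):
        response = generate(attempt)  # stand-in for sampling from an LLM
        if not has_referencing_error(table, response):
            return response
    return None
```

In this sketch, a response such as `Bob sold the most: [Bob|sales=120].` would be rejected whenever the table stores a different value for that cell, and sampling continues until a response passes the critic or the retry budget is exhausted.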

Topics

Artificial Intelligence > Core AI > Large Language Models
Artificial Intelligence > Core AI > Evaluation

Keywords

table understanding, rejection sampling, critic model, large language model, data referencing error
