Adaptive Spatial and Temporal Redundancy Optimization for Efficient Reasoning in Large Language Models

Tianle Chen; Pengyu Cheng; Qiyuan Zhu; Jiacheng Wang; Bei Liu; Hao Gu; Ruijie Shen; Xiaofeng Hou; Sirui Han; Jiacheng Liu

ACL 2026

Abstract

Large Language Models (LLMs) have achieved exceptional performance in complex reasoning via Chain-of-Thought (CoT), yet the associated computational costs remain prohibitive. CoT reasoning contains significant untapped efficiency potential across two dimensions: temporal redundancy, where reasoning steps may be unnecessary, and spatial redundancy, where computations can be performed at reduced precision. While current optimization techniques often necessitate resource-intensive fine-tuning or data curation, we introduce ASTRO (Adaptive Spatial and Temporal Redundancy Optimization), a training-free framework that simultaneously addresses both dimensions. ASTRO leverages Dewey’s reflective thinking model to segment reasoning phases, applying a progressive precision reduction strategy coupled with an entropy-based confidence mechanism for adaptive termination. Empirical results across diverse reasoning benchmarks demonstrate that ASTRO achieves up to an 11.3× efficiency gain without compromising accuracy, highlighting the advantages of holistic multi-dimensional redundancy management over isolated optimization methods.
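To make the idea of an entropy-based confidence signal concrete, the following is a minimal sketch and not the ASTRO implementation. The abstract does not give the formula, so the normalized Shannon entropy, the function names (`token_entropy`, `confidence`, `should_stop`), and the 0.9 threshold are all assumptions chosen for illustration.

```python
import math
from typing import Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def confidence(probs: Sequence[float]) -> float:
    """Map entropy to [0, 1]: 1.0 for a one-hot distribution, 0.0 for uniform.

    Assumption: confidence = 1 - H(p) / log(n). The paper may define it differently.
    """
    n = len(probs)
    if n < 2:
        return 1.0
    return 1.0 - token_entropy(probs) / math.log(n)


def should_stop(probs: Sequence[float], threshold: float = 0.9) -> bool:
    """Hypothetical termination rule: stop reasoning once confidence is high enough."""
    return confidence(probs) >= threshold
```

Under these assumptions, a sharply peaked distribution such as `[0.997, 0.001, 0.001, 0.001]` would trigger termination, while a uniform one would not. A practical system would presumably aggregate this signal over several tokens or candidate answers instead of relying on a single step.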

Topics

Artificial Intelligence > Core AI > Large Language Models
Deep Learning > Optimization & Theory > Efficient Computing
Deep Learning > Learning Types > Chain-of-Thought Reasoning

Keywords

chain-of-thought reasoning; computational efficiency; redundancy reduction; large language model; adaptive precision; entropy-based confidence
