Too Long, Do Re-weighting for Efficient LLM Reasoning Compression

Zhong-Zhi Li; Xiao Liang; Zihao Tang; Lei Ji; Peijie Wang; Haotian Xu; Xing W; Haizhen Huang; Weiwei Deng; Yeyun Gong; Zhijiang Guo; Xiao Liu; Fei Yin; Cheng-lin Liu

ACL 2026

Abstract

Large Language Models (LLMs) have recently achieved remarkable progress on complex reasoning tasks by leveraging extended Chain-of-Thought (CoT) techniques. These reasoning processes can be roughly categorized into System-1 (fast and intuitive) and System-2 (slow and deliberate) paradigms. However, excessive reliance on lengthy System-2-style reasoning during inference can produce extremely long outputs, thereby reducing efficiency. In this work, we propose Thinking Length Data Re-weighting (TLDR), a method that does not rely on sophisticated data annotations or interpolation between multiple models. We continuously balance the weights between the model’s System-1 and System-2 data to eliminate redundant reasoning processes while preserving the model’s reasoning capability. We validate our method across multiple base models, including DeepSeek-R1-Distilled Qwen models, as well as on diverse benchmarks with varying difficulty levels. Our method reduces the number of output tokens by nearly 40% while maintaining reasoning accuracy.
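The idea of continuously balancing System-1 and System-2 data can be illustrated with a minimal sketch. This is not the authors' algorithm: the update rule, the function names `update_system1_weight` and `sample_batch`, the `target_accuracy` threshold, and the step size are all hypothetical, chosen only to show what a dynamically re-weighted data mixture might look like.

```python
import random


def update_system1_weight(weight, accuracy, target_accuracy, step=0.05):
    """Hypothetical re-weighting rule (an assumption, not taken from the paper).

    If accuracy falls below the target, shift the mixture toward long
    System-2 data; otherwise shift toward short System-1 data to
    encourage more compressed reasoning.
    """
    if accuracy < target_accuracy:
        weight -= step
    else:
        weight += step
    # Keep the weight a valid probability.
    return min(1.0, max(0.0, weight))


def sample_batch(system1_data, system2_data, system1_weight, batch_size, rng):
    """Draw each example from the System-1 pool with probability system1_weight."""
    batch = []
    for _ in range(batch_size):
        pool = system1_data if rng.random() < system1_weight else system2_data
        batch.append(rng.choice(pool))
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    short_examples = ["short-cot-a", "short-cot-b"]  # placeholder System-1 data
    long_examples = ["long-cot-a", "long-cot-b"]     # placeholder System-2 data
    weight = 0.5
    # Accuracy values here are made up purely to exercise the update rule.
    for accuracy in (0.82, 0.79, 0.81):
        weight = update_system1_weight(weight, accuracy, target_accuracy=0.80)
        batch = sample_batch(short_examples, long_examples, weight, 4, rng)
        print(round(weight, 2), batch)
```

In this sketch the mixture weight is the only state carried between steps, which mirrors the abstract's claim that no extra annotations or model interpolation are required; how the actual method measures accuracy and length is not specified in the abstract.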

Topics

Artificial Intelligence > Core AI > Large Language Models; Artificial Intelligence > Core AI > Reasoning; Deep Learning > Learning Types > Chain-of-Thought Reasoning

Keywords

chain-of-thought reasoning; reasoning compression; data re-weighting
