ParaSuite: Boosting LLM Reasoning via Paradox Resolution

Bin Chen; Yu Zhang; Hongfei Ye; Huiyang Wang; Wenxi Liu; Hongyang Chen

ACL 2026

Abstract

Logical reasoning is a key capability of large language models, yet current benchmarks focus almost entirely on tasks that check only basic logical consistency and overlook the reflective reasoning required for paradox detection and resolution. To fill this gap, we present ParaSuite, the first pipeline dedicated to paradox research that automates data synthesis, evaluation, and training. We introduce PARADOX, a synthetic, high-quality dataset spanning two difficulty tiers and three academic domains, accompanied by specialized evaluation metrics and solving algorithms. We propose ParadoxBreaker-7B, trained with Mutual-Information Guided Fine-Tuning and reinforcement learning with a step-verified paradox reward (PAPO). Experiments demonstrate significant improvements in both paradoxical and general STEM reasoning.

Topics

Artificial Intelligence > Core AI > Large Language Models
Artificial Intelligence > Core AI > Reasoning

Keywords

logical reasoning, large language model, STEM reasoning, paradox resolution
