ACL 2026

ParaSuite: Boosting LLM Reasoning via Paradox Resolution

Abstract

Logical reasoning is a key capability of large language models, yet current benchmarks focus almost entirely on tasks that check only basic logical consistency and overlook the reflective reasoning required for paradox detection and resolution. To fill this gap, we present ParaSuite, the first pipeline dedicated to paradox research that automates data synthesis, evaluation, and training. We introduce PARADOX, a synthetic, high-quality dataset spanning two difficulty tiers and three academic domains, accompanied by specialized evaluation metrics and solving algorithms. We propose ParadoxBreaker-7B, trained with Mutual-Information Guided Fine-Tuning and reinforcement learning with a step-verified paradox reward (PAPO). Experiments demonstrate significant improvements in both paradoxical and general STEM reasoning.