Don’t Be Misled by Style: A Style-Adaptive Reranker for Capturing Effective Knowledge in Retrieval-Augmented Generation

Ruwen Zhang; Bo Liu; Zhang Sheng Xiang; Yida Chen; Hantao Zhao; Ding Ding; Jiahui Jin; Jiuxin Cao

ACL 2026

Abstract

Rerankers are critical in Retrieval-Augmented Generation (RAG): they filter the evidence that LLMs rely on to generate accurate answers. As RAG extends to open-domain scenarios, rerankers are inevitably deployed on mixed-style corpora, yet most existing rerankers are trained mainly on well-edited text. A rarely explored issue is how to enable rerankers to capture as much of the knowledge that is effective for downstream LLMs as possible without being misled by stylistic features. To address this issue, we propose SARK (Style-Adaptive Reranker with Knowledge Prioritization), a style-augmented multi-task framework that prioritizes effective knowledge over stylistic perturbations. SARK performs multi-granular knowledge mining: it uses an LLM to derive passage-level supervision on whether a passage helps or harms answer correctness, along with list-level relative ranking preferences over candidate passages. It then jointly optimizes the reranker with passage-level classification and list-level ranking objectives via style-augmented multi-task learning, encouraging the model to focus on the information needed for answering in mixed-style scenarios. Extensive experiments demonstrate that SARK improves generation performance across multiple LLMs under mixed-style conditions.
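The joint objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss functions, so everything here is an assumption made for illustration: binary cross-entropy for the passage-level "helps or harms" label, a ListNet-style softmax cross-entropy for the list-level preferences, and a hypothetical mixing weight `alpha`. The function names are likewise hypothetical and are not taken from SARK.

```python
import math


def log_softmax(xs):
    """Numerically stable log-softmax over a list of floats."""
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]


def passage_loss(scores, helpful_labels):
    """Mean binary cross-entropy; scores are raw reranker logits,
    labels are 1 (passage helps the answer) or 0 (passage harms it)."""
    total = 0.0
    for s, y in zip(scores, helpful_labels):
        # log(sigmoid(s)), computed without overflow for large |s|
        if s >= 0:
            log_sig = -math.log1p(math.exp(-s))
        else:
            log_sig = s - math.log1p(math.exp(s))
        log_one_minus_sig = log_sig - s  # equals log(sigmoid(-s))
        total -= y * log_sig + (1 - y) * log_one_minus_sig
    return total / len(scores)


def list_loss(scores, target_prefs):
    """ListNet-style loss (assumed form): cross-entropy between the softmax
    of LLM-derived preference scores and the softmax of reranker scores."""
    log_p = log_softmax(scores)
    target = [math.exp(t) for t in log_softmax(target_prefs)]
    return -sum(t * lp for t, lp in zip(target, log_p))


def joint_loss(scores, helpful_labels, target_prefs, alpha=0.5):
    """Weighted sum of the two objectives; alpha is a hypothetical knob."""
    return (alpha * passage_loss(scores, helpful_labels)
            + (1 - alpha) * list_loss(scores, target_prefs))


# Two candidate passages: the first is helpful and preferred by the LLM.
aligned = joint_loss([3.0, -3.0], [1, 0], [2.0, 0.0])
misaligned = joint_loss([-3.0, 3.0], [1, 0], [2.0, 0.0])
print(aligned < misaligned)
```

In this toy setup a reranker whose scores agree with both supervision signals receives a lower joint loss than one that ranks the harmful passage first. How the style-augmented variants of each passage enter the objective is not specified in the abstract and is left out of the sketch.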

Topics

Natural Language Processing > Resources & Methods > Large Language Models
Natural Language Processing > Generation > Retrieval-Augmented Generation
Deep Learning > Learning Types > Multi-Task Learning

Keywords

multi-task learning, retrieval-augmented generation, text reranking, large language model
