Know the Known and the Unknown: Reasonable Answer Generation with Knowledge-Informed Citations

Yichi Zhang; Zhuo Chen; Lingbing Guo; Jun Xu; Mengshu Sun; Zhizhen Liu; Lei Liang; Wen Zhang; Huajun Chen

ACL 2026

Abstract

Question answering (QA) with reference texts is a classic application scenario for large language models (LLMs), where high standards for the credibility and traceability of generated answers are crucial. Many existing approaches focus on generating multi-level citations linked to specific references within the answer, making it verifiable and trustworthy. However, they often overlook key challenges such as citation granularity, the awareness of unknown information, and the adoption of effective training strategies. In this paper, we introduce Knowledge-informed Citation (KFC), which addresses these issues through a novel data construction pipeline, a new benchmark, and an innovative training strategy. With approximately 42K samples spanning 19 distinct domains, KFC includes both traditional citations referencing known entity-level information and specialized citations referring to unknown knowledge in the given question. This structure provides a more granular approach to citations, guiding the model to recognize and explicitly indicate unknown information, thus enhancing the quality and credibility of the response. Additionally, we propose a self-correction paradigm, Self-KFC, designed to fine-tune LLMs by refining poorly cited answers into more accurate ones, making it particularly suitable for citation-dependent scenarios. We present comprehensive experimental results to demonstrate the effectiveness and generalization of Self-KFC on the KFC benchmark.

Topics

Natural Language Processing > Applications > Question Answering; Artificial Intelligence > Core AI > Large Language Models; Artificial Intelligence > Core AI > Question Answering

Keywords

question answering; answer generation; knowledge-informed citation; self-correction paradigm; citation granularity
