CEDAR: A Chinese Evaluation Dataset for Computational Argumentation

Tian Lan; Jiang Li; Rong Yan; Feilong Bao; Weihua Wang; Guanglai Gao; Xiangdong Su

ACL 2026

Abstract

Computational argumentation has received increasing attention in recent years. However, existing debate datasets omit several labels that are important for argument mining, generation, and evaluation. Meanwhile, the lack of comprehensively annotated Chinese oral debate datasets hinders progress in this field. To address these gaps, we introduce CEDAR, a comprehensive Chinese Evaluation Dataset for Computational Argumentation. Compared to previous datasets, CEDAR includes the essential labels of computational argumentation (claim, stance, evidence) and five additional labels: rhetorical figures, debater roles, modal words, utterance time, and debate results. It also offers the complete transcript of each debate, including speeches from both the Pro and Con sides. CEDAR therefore supports not only common argument mining and generation tasks, but also rhetorical figure detection, argument quality evaluation, and debate result prediction. The dataset covers 600 debates on 318 topics from Chinese debate competitions. Besides providing a dataset for research, we conduct experiments on common computational argumentation tasks and on a novel task, rhetorical figure detection, and we also evaluate LLMs on these tasks. The experimental results highlight the challenging nature of the dataset. Our corpus is available at https://github.com/VelikayaScarlet/CEDAR.

Authors

Tian Lan; Jiang Li; Rong Yan; Feilong Bao; Weihua Wang; Guanglai Gao; Xiangdong Su

Topics

Natural Language Processing > Applications > Information Extraction

Natural Language Processing > Applications > Argument Mining

Natural Language Processing > Applications > Evaluation

Keywords

argument mining; computational argumentation; argument generation; rhetorical figure; debate result prediction
