ArgGenBench: Benchmarking the Complex Controlled Argument Generation Capability of Large Language Models

Bojun Jin; Jianzhu Bao; Yang Sun; Yice Zhang; Ruifeng Xu

ACL 2026

Abstract

Argument generation is a fundamental NLP task that aims to automatically produce persuasive arguments. Effective human argumentation is inherently complex and multifaceted: it integrates argumentative strategies, appropriate styles, adaptation to target audiences, and more. However, existing studies focus on a limited set of control signals, such as topic, stance, or key aspects, and fail to capture this complexity. As LLMs advance, the lack of benchmarks that evaluate multifaceted argumentative control has become a critical bottleneck. To address this, we introduce ArgGenBench, a novel benchmark of complex instructions that integrate multi-dimensional control over topic, stance, length, style, strategy, audience, and key points. Extensive evaluation across 15 LLMs reveals significant limitations: even the best-performing model achieves a win rate of only 42.7% against human-verified references. These results highlight the challenge of controlled argument generation and establish ArgGenBench as a rigorous testbed for developing more capable systems.
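
To make the seven control dimensions and the win-rate metric concrete, the following is a minimal illustrative sketch. It is not the benchmark's actual data format or evaluation code: the `ControlSpec` class, its field values, the instruction wording, and the `win_rate` helper are all hypothetical, and the sketch assumes that each comparison against a reference yields a verdict of "win", "tie", or "loss", with ties counted as non-wins.

```python
from dataclasses import dataclass, field


@dataclass
class ControlSpec:
    """Hypothetical container for the seven control dimensions named in the abstract."""
    topic: str
    stance: str
    length: str
    style: str
    strategy: str
    audience: str
    key_points: list[str] = field(default_factory=list)

    def to_instruction(self) -> str:
        # Assumed instruction wording; the benchmark's real prompts may differ.
        points = "; ".join(self.key_points)
        return (
            f"Write an argument on '{self.topic}' taking the stance '{self.stance}'. "
            f"Length: {self.length}. Style: {self.style}. Strategy: {self.strategy}. "
            f"Audience: {self.audience}. Key points: {points}."
        )


def win_rate(verdicts: list[str]) -> float:
    """Fraction of pairwise comparisons won against the reference.

    Assumption: ties and losses both count as non-wins.
    """
    if not verdicts:
        raise ValueError("at least one verdict is required")
    return sum(v == "win" for v in verdicts) / len(verdicts)


spec = ControlSpec(
    topic="remote work",
    stance="support",
    length="about 150 words",
    style="formal",
    strategy="appeal to evidence",
    audience="company executives",
    key_points=["productivity", "cost savings"],
)
instruction = spec.to_instruction()
rate = win_rate(["win", "loss", "tie", "win"])
```

Under these assumptions, a model's score is simply the share of test instructions on which its argument is preferred over the human-verified reference.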

Topics

Natural Language Processing > Generation > Text Generation
Natural Language Processing > Applications > Argument Mining
Artificial Intelligence > Core AI > Evaluation

Keywords

persuasive argument; controlled argument generation; multi-dimensional control
