Small Agents, Big Gains: Journey-Aware and Critic-Guided Simulation for Long-Horizon Shopping Dialogues

Qing Ping; Changyou Chen; Binxuan Huang

ACL 2026

Abstract

Modern e-commerce assistants must go beyond simple product search to support inspiration, comparison, and tool-grounded fact-checking across non-linear shopping journeys. However, distilling these complex behaviors into efficient, deployable models is bottlenecked by a lack of post-training data: trajectories must cover diverse agentic workflows with high fidelity, yet the desired outputs are open-ended and have no single ground truth. We propose a closed-loop Multi-Agent Simulation Framework to synthesize diverse, faithful, and policy-aligned shopping trajectories. The system orchestrates three components: a journey-aware, stateful user simulator that drives exploration, a shopping agent that manages both tools and UI elements, and a critic agent that provides rubric-driven feedback to iteratively refine the data. On a domain-specific benchmark, this synthetic data enables a small model to significantly outperform same-size baselines and surpass a large-model baseline, achieving near-zero tool-calling errors with 8× higher inference throughput.
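The closed loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the stage names, the `UserSimulator`, `shopping_agent`, `critic`, and `simulate` names, the `search_catalog` tool, and the single-rule rubric are all hypothetical, and the stubs stand in for what would presumably be model-backed components. The user stub also walks its stages in a fixed order, whereas the abstract describes non-linear journeys.

```python
from dataclasses import dataclass

# Hypothetical journey stages, loosely named after the behaviors the abstract lists.
STAGES = ["inspiration", "comparison", "fact_check"]


@dataclass
class UserSimulator:
    """Stateful user stub that remembers where it is in the journey."""
    stage_idx: int = 0

    def next_utterance(self):
        if self.stage_idx >= len(STAGES):
            return None  # journey finished
        stage = STAGES[self.stage_idx]
        self.stage_idx += 1
        return {"stage": stage, "text": f"user request during {stage}"}


def shopping_agent(utterance, feedback=None):
    """Agent stub: omits the tool call at first, adds it once the critic asks."""
    turn = {
        "stage": utterance["stage"],
        "reply": f"reply to: {utterance['text']}",
        "tool_call": None,
    }
    if feedback == "missing_tool_call":
        # Hypothetical tool name and arguments.
        turn["tool_call"] = {"name": "search_catalog",
                             "args": {"query": utterance["stage"]}}
    return turn


def critic(turn):
    """Rubric stub with one rule: every turn must carry a tool call.

    Returns None when the turn passes, otherwise a feedback label.
    """
    return None if turn["tool_call"] is not None else "missing_tool_call"


def simulate(max_revisions=2):
    """Run one journey; keep only the turns that pass the critic's rubric."""
    user = UserSimulator()
    trajectory = []
    while (utterance := user.next_utterance()) is not None:
        turn = shopping_agent(utterance)
        for _ in range(max_revisions):
            feedback = critic(turn)
            if feedback is None:
                break
            turn = shopping_agent(utterance, feedback)  # revise using feedback
        if critic(turn) is None:
            trajectory.append(turn)
    return trajectory
```

Calling `simulate()` returns one accepted turn per stage, each revised once after critic feedback. With `max_revisions=0` no revision happens and every turn is filtered out, which shows the role the critic loop plays in this toy setup.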

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems
Artificial Intelligence > Core AI > Reinforcement Learning
Artificial Intelligence > Core AI > Dialogue Systems

Keywords

multi-agent simulation; user simulator; shopping dialogue; critic-guided data refinement; journey-aware simulation
