Uncertainty-Aware Routing for Principled Alignment with MoE Dynamics

Yilong Chen; Junyuan Shang; Yuchen Feng; Zhenyu Zhang; Naibin Gu; Ziqi Wang; Tingwen Liu; Shuohuan Wang; Yu Sun; Hua Wu; Haifeng Wang

ACL 2026

Abstract

Mixture-of-Experts (MoE) is a cornerstone for scaling LLMs, yet its training dynamics remain poorly understood, often leading to sub-optimal specialization. Moving beyond static routing, we present a systematic study of the MoE lifecycle using Helmholtz Free Energy and Router Entropy. We identify a universal Three-Stage Phase Transition (Exploration, Symmetry Breaking, and Stabilization) marked by an Energy Climb and Plateau. This reflects Frustrated Exploration, caused by structural interference between specialization drives and uniformity constraints. To address this, we propose Uncertainty-Aware Routing (UAR), which aligns routing with the model’s epistemic state via: (1) Evidence-Triggered Expansion, increasing active experts for high-energy tokens, and (2) Epistemic Masking, applying load-balancing only in high-uncertainty regimes to shield mature experts. Experiments confirm UAR reduces perplexity and improves expert distinctiveness, offering a principled path toward thermodynamically aligned computation.
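The abstract does not give formulas, so the following is only a minimal sketch of how the two diagnostics and the two UAR mechanisms might be expressed, not the authors' implementation. It assumes the common energy-based definition of free energy, F = -T log sum_e exp(z_e / T) over a token's router logits z, and Shannon entropy of the softmax routing distribution. The function names, the thresholds, the sign convention for "high-energy", and the squared-load balancing penalty are all hypothetical choices made for illustration.

```python
import numpy as np

def free_energy(logits, temperature=1.0):
    """Per-token free energy F = -T * logsumexp(logits / T) (assumed definition)."""
    z = logits / temperature
    m = z.max(axis=-1, keepdims=True)
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return -temperature * lse

def router_entropy(logits):
    """Per-token Shannon entropy (in nats) of the softmax routing distribution."""
    z = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def expanded_top_k(logits, base_k=1, extra_k=1, energy_threshold=0.0):
    """Hypothetical expansion rule: tokens whose free energy exceeds the
    threshold activate base_k + extra_k experts, all others base_k.
    Expects logits of shape [tokens, experts]; returns a boolean mask."""
    k = np.where(free_energy(logits) > energy_threshold, base_k + extra_k, base_k)
    order = np.argsort(-logits, axis=-1, kind="stable")
    ranks = np.argsort(order, axis=-1, kind="stable")  # rank of each expert per token
    return ranks < k[:, None]

def masked_balance_loss(logits, active, entropy_threshold):
    """Hypothetical masked penalty: load balancing is computed only over
    tokens whose router entropy exceeds the threshold."""
    uncertain = router_entropy(logits) > entropy_threshold
    if not uncertain.any():
        return 0.0
    load = active[uncertain].mean(axis=0)  # share of uncertain tokens per expert
    load = load / load.sum()
    n_experts = logits.shape[-1]
    return float(n_experts * (load ** 2).sum() - 1.0)  # 0 when load is uniform
```

Under these assumptions, a token with one dominant logit has low free energy and low entropy, so it keeps its single expert and is excluded from the balancing penalty, while a token with flat logits gains an extra expert and contributes to the penalty.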

Topics

Artificial Intelligence > Bayesian & Probabilistic > Probabilistic Modeling
Artificial Intelligence > Core AI > Large Language Models
Artificial Intelligence > Core AI > Uncertainty Quantification

Keywords

uncertainty quantification, phase transition, mixture of experts, expert routing
