From Experts to Bases: Orthogonal Subspace Mixture for Continual Multimodal Instruction Tuning

Pei Chen; Xilai Wang; Shiqixu; Zejian Li; Lingyun Sun

ACL 2026

Abstract

Multimodal Continual Instruction Tuning (MCIT) is essential for adapting Multimodal Large Language Models (MLLMs) to dynamic data streams, yet preventing catastrophic forgetting remains a major challenge. Existing parameter-efficient approaches often face a dilemma: fixed architectures suffer from knowledge interference, while dynamic strategies incur inefficient capacity expansion, limiting scalability. We propose MoBLoRA (Mixture-of-Bases LoRA), a novel framework for MCIT. Motivated by our geometric analysis revealing subspace redundancy across sequential tasks, MoBLoRA shifts the paradigm from expert selection to subspace mixing: it decomposes adaptation weights into a globally shared pool of orthonormal bases to capture task-invariant knowledge, and lightweight mixing matrices to encode task-specific variations. This design effectively decouples knowledge accumulation from task reconstruction. Experiments on standard benchmarks show MoBLoRA significantly outperforms state-of-the-art methods while maintaining superior parameter efficiency.
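The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact factorization, so the form `delta_W = U @ M_t @ V.T`, the names `U`, `V`, `mixers`, `delta_w`, and all dimensions are assumptions chosen only to show how a shared pool of orthonormal bases and a small per-task mixing matrix might combine.

```python
import numpy as np


def orthonormal_basis(dim, k, rng):
    """Return a (dim, k) matrix with orthonormal columns via reduced QR."""
    q, _ = np.linalg.qr(rng.standard_normal((dim, k)))
    return q


rng = np.random.default_rng(0)
d_out, d_in, k = 16, 12, 4  # hypothetical layer sizes and basis-pool size

# Shared pool of orthonormal bases (assumed to be shared by all tasks).
U = orthonormal_basis(d_out, k, rng)  # output-side bases
V = orthonormal_basis(d_in, k, rng)   # input-side bases

# Lightweight task-specific mixing matrices (k x k each, assumed shape).
mixers = {
    "task_a": rng.standard_normal((k, k)),
    "task_b": rng.standard_normal((k, k)),
}


def delta_w(task):
    """Reconstruct the weight update for one task from the shared bases."""
    return U @ mixers[task] @ V.T


# Parameter counts under these assumptions.
shared_params = (d_out + d_in) * k  # paid once, for the shared pool
per_task_params = k * k             # paid once per task

print(delta_w("task_a").shape, shared_params, per_task_params)
```

Because the bases are orthonormal in this sketch, a task's mixing matrix can be read back from its update by projection, `U.T @ delta_w(task) @ V`, which is one way to picture task-specific variation living in a small coordinate matrix while the bases hold what is shared.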

Topics

Deep Learning > Learning Types > Multi-Modal Learning; Deep Learning > Learning Types > Continual Learning; Deep Learning > Learning Types > Parameter-Efficient Fine-Tuning

Keywords

catastrophic forgetting; multimodal large language model; subspace mixing; multimodal continual instruction tuning
