

Poster
in
Workshop: SCOPE: SCALABLE OPTIMIZATION FOR EFFICIENT AND ADAPTIVE FOUNDATION MODELS

AsymLoRA: Unlocking the Power of Multimodal LLMs via Asymmetric LoRA

Xuyang Wei · Chunlin Tian · Li Li

Keywords: [ Asymmetric LoRA ] [ Efficient fine-tuning ] [ Multimodal LLMs ]


Abstract:

Effective instruction fine-tuning on diverse image-text datasets is crucial for developing a versatile Multimodal Large Language Model (MLLM), where dataset composition dictates the model’s adaptability across multimodal tasks. However, complex datasets often contain inherent conflicts—stemming from modality-specific optimization objectives—and latent commonalities that enable cross-task transfer, which most existing approaches handle separately. To bridge this gap, we introduce AsymLoRA, a parameter-efficient tuning framework that unifies knowledge modularization and cross-modal coordination via asymmetric LoRA: task-specific low-rank projections (matrix B) that preserve distinct adaptation pathways for conflicting objectives, and a shared projection (matrix A) that consolidates cross-modal commonalities. Extensive evaluations demonstrate that AsymLoRA consistently surpasses both vanilla LoRA, which captures only commonalities, and LoRA-MoE, which focuses solely on conflicts, achieving superior model performance and system efficiency across diverse benchmarks.
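The asymmetric split described above — one shared down-projection A for cross-modal commonalities, separate up-projections B_t for conflicting task objectives — can be sketched as a minimal layer. This is an illustrative NumPy sketch under assumed shapes and a task-id routing interface, not the authors' implementation:

```python
import numpy as np

class AsymLoRALinear:
    """Sketch of an asymmetric LoRA layer: a single shared projection A
    consolidates cross-modal commonalities, while per-task projections
    B_t preserve distinct adaptation pathways for conflicting objectives.
    Shapes, init, and task routing are illustrative assumptions."""

    def __init__(self, d_in, d_out, rank, tasks, alpha=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in)) * 0.02   # frozen base weight
        self.A = rng.standard_normal((rank, d_in)) * 0.02    # shared across tasks
        self.B = {t: np.zeros((d_out, rank)) for t in tasks}  # task-specific, zero-init
        self.scale = alpha / rank

    def forward(self, x, task):
        # Base path plus the low-rank update B_t @ A for the selected task.
        delta = self.B[task] @ (self.A @ x)
        return self.W @ x + self.scale * delta
```

Because each B_t is zero-initialized, every task starts from the frozen base output; sharing A also means the trainable budget is d_in·r + T·r·d_out parameters rather than the T·r·(d_in + d_out) needed by T fully separate LoRA adapters.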
