MiSS: Revisiting the Trade-off in LoRA with an Efficient Shard-Sharing Structure
Jiale Kang · Qingyu Yin
Abstract
Low-Rank Adaptation (LoRA) is a widely adopted technique for parameter-efficient fine-tuning, but its slow convergence has spurred the development of numerous variants. Nevertheless, current approaches struggle to improve performance, memory footprint, and computational efficiency simultaneously. To address this challenge, we revisit the causes of LoRA’s slow convergence and, based on these insights, propose \textbf{M}atr\textbf{i}x \textbf{S}hard \textbf{S}haring (MiSS), which shards the original weight matrix and updates it through a single shared trainable matrix $\boldsymbol{D}$ initialized to zero. To further ensure computational efficiency, a low memory footprint, and scalable serving, we introduce MiSS$^e$. Theoretical analyses and empirical results show that our method reduces optimization complexity while maintaining strong performance, striking a favorable balance between performance, memory, and efficiency. Furthermore, we provide a comprehensive analysis of different PEFT methods with respect to memory usage, initialization time, and computational efficiency. By mapping the Pareto frontier, we show that MiSS integrates the strengths of prior approaches and balances these dimensions effectively.
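To make the shard-sharing idea concrete, the sketch below illustrates one plausible reading of the abstract: the frozen weight is viewed as shards along its output dimension, and a single trainable matrix $\boldsymbol{D}$, initialized to zero, is shared by and added to every shard. The class name, the sharding axis, and the shapes are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn


class MiSSLinearSketch(nn.Module):
    """Minimal shard-sharing sketch (assumption-based, not the official MiSS code).

    The pretrained weight W of shape (out_features, in_features) is viewed as
    num_shards shards of shape (shard_rows, in_features). One trainable matrix D,
    zero-initialized, is broadcast-added to every shard, so at step 0 the layer
    is identical to the pretrained one. Bias handling is omitted for brevity.
    """

    def __init__(self, base_linear: nn.Linear, num_shards: int):
        super().__init__()
        out_features, in_features = base_linear.weight.shape
        assert out_features % num_shards == 0, "out_features must divide evenly into shards"
        self.weight = base_linear.weight          # frozen pretrained weight
        self.weight.requires_grad_(False)
        self.num_shards = num_shards
        shard_rows = out_features // num_shards
        # Single shared trainable matrix D, initialized to zero.
        self.D = nn.Parameter(torch.zeros(shard_rows, in_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out_features, in_features = self.weight.shape
        # View W as (num_shards, shard_rows, in_features) and add D to each shard.
        w_shards = self.weight.view(self.num_shards, -1, in_features)
        w_adapted = (w_shards + self.D).view(out_features, in_features)
        return x @ w_adapted.T
```

Under this reading, only $\boldsymbol{D}$ receives gradients, so the trainable parameter count is that of a single shard rather than of a low-rank pair of matrices as in LoRA.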