LoRA-S: An Efficient Low-Rank Adaptation Scheme via the Sylvester Equation
Jinyang ZHENG · Tong Wu
Abstract
Numerous studies on low-rank adaptation (LoRA) have emerged in recent years, aiming to accelerate the convergence of the LoRA framework. In this paper, we leverage the horizontal lift theory from differential geometry to establish a general iteration scheme on the quotient manifold $\mathbb{R}_*^{m \times r} \times \mathbb{R}_*^{n \times r}/\sim$. By endowing the LoRA framework with Riemannian quotient geometries, our theory not only guarantees efficient feature learning but also bridges LoRA algorithms and the pre-training algorithms for large models. Furthermore, we theoretically analyze the role of the weight-decay matrix $\epsilon_{\mathrm{decay}} I$ in efficient feature learning and replace it with the Sylvester matrix $K$, showing that the theory removes an important hyperparameter while yielding accurate and computationally efficient optimizers. Based on the general scheme, we propose two efficient LoRA optimizers with runtime analysis, Adam-Sylvester (AdamS) and LRACS, and conduct experiments on transformer-based networks. The results demonstrate clear improvements over existing optimizers.
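For context, the block below sketches the standard structures the abstract refers to; the equivalence relation and the generic Sylvester form shown here are conventional choices in the low-rank literature and are stated as assumptions, not quoted from the paper's own definitions.

```latex
\begin{align*}
  % Rank-r LoRA update of a frozen weight W_0 (standard parameterization)
  W &= W_0 + B A^{\top}, \qquad
      B \in \mathbb{R}_*^{m \times r},\; A \in \mathbb{R}_*^{n \times r},\\
  % A common GL(r) invariance: factor pairs giving the same product are
  % identified, which induces the quotient
  % \mathbb{R}_*^{m \times r} \times \mathbb{R}_*^{n \times r}/\sim
  (B, A) &\sim (B M,\, A M^{-\top}), \qquad M \in \mathrm{GL}(r),\\
  % Generic Sylvester equation in the unknown X; it has a unique solution
  % whenever P and -Q share no eigenvalue
  P X + X Q &= C .
\end{align*}
```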