Poster in Workshop: Mathematical and Empirical Understanding of Foundation Models (ME-FoMo)

Empirical Analysis of the Strengths and Weaknesses of PEFT Techniques for LLMs

George Pu · Anirudh Jain · Jihan Yin · Russell Kaplan

Keywords: Parameter-Efficient Fine-Tuning (PEFT), large language models, PEFT benchmarks, PEFT convergence, PEFT optimization


Abstract:

As foundation models continue to scale exponentially in size, efficient methods of adaptation become increasingly critical. Parameter-efficient fine-tuning (PEFT), a recent class of techniques that modify only a small percentage of model parameters, is currently the most popular approach for adapting large language models (LLMs). Several PEFT techniques with varying tradeoffs have recently been proposed. We provide a comprehensive, uniform benchmark of these PEFT techniques on a representative LLM, FLAN-T5, evaluating model performance across different data scales on classification and generation datasets. From these results, we derive a framework for choosing the optimal PEFT technique given the task type and data availability. Contrary to popular belief, we also show empirically that PEFT techniques converge more slowly and perform worse than full fine-tuning in low-data scenarios, and we characterize the amount of data required for PEFT methods to both perform well and converge efficiently. Lastly, we further optimize these PEFT techniques by selectively choosing which parts of the model to train, and find that they can be applied to significantly fewer parameters while maintaining model performance.
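As a concrete illustration of the kind of setup the abstract describes, the sketch below applies one representative PEFT technique (LoRA, via the HuggingFace `peft` library) to FLAN-T5, restricting adaptation to a subset of modules in the spirit of the paper's selective-training experiments. This is a minimal sketch, not the authors' actual configuration: the rank, scaling, dropout, and `target_modules` values shown here are assumptions for illustration only.

```python
# Minimal PEFT sketch: LoRA on FLAN-T5 with the HuggingFace `peft` library.
# NOTE: all hyperparameters below are assumed example values, not the
# configuration benchmarked in the paper.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

# Load a representative LLM from the FLAN-T5 family.
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,                        # low-rank dimension (assumed value)
    lora_alpha=32,              # LoRA scaling factor (assumed value)
    lora_dropout=0.1,           # dropout on the LoRA layers (assumed value)
    target_modules=["q", "v"],  # adapt only the attention query/value
                                # projections, illustrating selective
                                # training of parts of the model
)

# Wrap the base model; only the injected LoRA parameters remain trainable,
# so fine-tuning updates a small percentage of the total parameters.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the (small) trainable fraction
```

The wrapped model can then be trained with any standard fine-tuning loop (e.g. `transformers.Trainer`); narrowing or widening `target_modules` is one simple way to trade off trainable-parameter count against task performance, which is the axis the paper's selective-training experiments explore.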
