
RUC & Xiaomi: Efficient Fine-Tuning 🙌🎉

📰 News

  • 2025-4-29: Our paper has been accepted by IJCAI-25. Congratulations!
  • 2025-3-31: Released a prototype system for parameter-efficient and gradient-projection methods: a comprehensive benchmark against 10+ state-of-the-art efficient fine-tuning approaches.
  • 2024-12-30: Theoretical Insights into Fine-Tuning Attention Mechanism.

🎯 Introduction and Target

(1) Our insights (paper, in progress):

According to the traditional statistical-learning viewpoint, performance can be decomposed into the sum of optimization error and generalization error. On the generalization (storage-friendly) side, we give Theorem 1 (information-theoretic generalization bounds), showing that, for the same $r$, fine-tuning $\mathbf{W}_q,\mathbf{W}_v$ consistently achieves results comparable to, or even surpassing, those of fine-tuning $\mathbf{W}_q,\mathbf{W}_k,\mathbf{W}_v$. This reduces the number of trainable parameters at the same $r$, improves the generalization bound, and can also save memory. On the optimization (time-friendly) side, we study the learning dynamics of fine-tuning the attention mechanism, and Theorem 2 shows that feature learning in the attention mechanism is efficient when the learning rate for $\mathbf{W}_v$ is much larger than that for $\mathbf{W}_q,\mathbf{W}_k$. Building on these experimental and theoretical insights, one can develop new algorithms that improve the efficiency (e.g., storage and time) of fine-tuning.

(Figure: statement of Theorem 1)

(Figure: statement of Theorem 2)
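
The two theorems translate directly into training choices. Below is a minimal sketch of both, assuming a toy attention module written in PyTorch; the module, parameter names, and learning-rate values are illustrative only, not the paper's experimental setup.

    # Sketch of the two insights: fine-tune only W_q and W_v (Theorem 1),
    # and give W_v a much larger learning rate than W_q (Theorem 2).
    import torch
    import torch.nn as nn

    class ToyAttention(nn.Module):
        def __init__(self, d_model: int = 64):
            super().__init__()
            self.W_q = nn.Linear(d_model, d_model, bias=False)
            self.W_k = nn.Linear(d_model, d_model, bias=False)
            self.W_v = nn.Linear(d_model, d_model, bias=False)

        def forward(self, x):
            q, k, v = self.W_q(x), self.W_k(x), self.W_v(x)
            attn = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
            return attn @ v

    model = ToyAttention()

    # Theorem 1 (storage-friendly): freeze W_k and fine-tune only W_q, W_v.
    for p in model.W_k.parameters():
        p.requires_grad = False

    # Theorem 2 (time-friendly): a much larger learning rate for W_v than for W_q.
    optimizer = torch.optim.AdamW([
        {"params": model.W_q.parameters(), "lr": 1e-5},
        {"params": model.W_v.parameters(), "lr": 1e-4},  # larger lr for W_v
    ])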

(2) Target:

This project conducts a comprehensive benchmark of the following 10+ efficient fine-tuning methods.

Notably, our proposed approach is orthogonal to these methods and can be combined with any of them.

📖 10+ efficient fine-tuning methods

⚙️ Install

  1. Install the dependencies:

    pip install -r requirements.txt

  2. (Optional) For SIFT & GaLore (a usage sketch of the GaLore optimizer follows these steps):

    git clone git@github.com:song-wx/SIFT.git
    cd SIFT
    pip install .
    pip install galore-torch
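
After installing galore-torch, its optimizer takes the place of the regular AdamW in the GaLore runs. The sketch below shows how it is typically constructed, following the interface documented by the galore-torch project; the model and the rank/update_proj_gap/scale values are illustrative placeholders, not the settings used in this benchmark.

    # Minimal GaLore optimizer sketch (illustrative values, not the benchmark's settings).
    import torch.nn as nn
    from galore_torch import GaLoreAdamW

    model = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 10))

    # GaLore's low-rank gradient projection applies to 2-D weight matrices only.
    galore_params = [p for p in model.parameters() if p.dim() == 2]
    other_params = [p for p in model.parameters() if p.dim() != 2]

    param_groups = [
        {"params": other_params},
        {"params": galore_params, "rank": 8, "update_proj_gap": 200,
         "scale": 0.25, "proj_type": "std"},
    ]
    optimizer = GaLoreAdamW(param_groups, lr=1e-4)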

🚀 Quick Start

Get Dataset

Run data_download.py to download the datasets used in the benchmark.

Usage

  1. Ensure the shell scripts are executable:

    chmod +x xxx.sh  # replace xxx with your script name
    
  2. Full Fine-Tuning, LoRA, AdaLoRA, DoRA, PiSSA, rsLoRA, OLoRA, EVA, SIFT (see the configuration sketch after this list):

    # choose the target method_name and modules.
    EfficientFT/sh/roberta-base-peft.sh 
    EfficientFT/sh/llama-peft.sh
    
  3. GaLore:

    EfficientFT/sh/roberta_galore.sh
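
The scripts above select a method_name and the attention modules to adapt. As an illustration of that choice (not the scripts' actual defaults), the sketch below configures LoRA with the Hugging Face peft library to adapt only the query and value projections of roberta-base, in line with the insight above; the rank/alpha values are placeholders, and the module names depend on the backbone.

    # Illustrative peft/LoRA configuration (placeholder hyperparameters).
    from transformers import AutoModelForSequenceClassification
    from peft import LoraConfig, get_peft_model

    base = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

    # Adapt only the query and value projections (W_q, W_v).
    lora_cfg = LoraConfig(
        r=8,
        lora_alpha=16,
        lora_dropout=0.1,
        target_modules=["query", "value"],
        task_type="SEQ_CLS",
    )
    model = get_peft_model(base, lora_cfg)
    model.print_trainable_parameters()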
    

😊 Some Results

(Figure: benchmark results)

📝 Citation

@article{yao2024theoretical,
  title={Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization},
  author={Yao, Xinhao and Qian, Hongjin and Hu, Xiaolin and Xu, Gengze and Liu, Yong and Liu, Wei and Luan, Jian and Wang, Bin},
  journal={arXiv preprint arXiv:2410.02247},
  year={2024}
}
