TTE: Two Tokens are Enough to Improve Parameter-Efficient Tuning

Abstract

Existing fine-tuning paradigms are predominantly characterized by Full Parameter Tuning (FPT) and Parameter-Efficient Tuning (PET). FPT fine-tunes all parameters of a pre-trained model on downstream tasks, whereas PET freezes the pre-trained model and employs only a minimal number of learnable parameters for fine-tuning. However, both approaches face overfitting, especially when downstream samples are limited. This issue has been thoroughly explored for FPT, but much less so for PET. To this end, this paper investigates overfitting in PET, representing a pioneering study in the field. Specifically, across 19 image classification datasets, we employ three classic PET methods (VPT, Adapter/AdaptFormer, and LoRA) and explore various regularization techniques to mitigate overfitting. Regrettably, the results suggest that existing regularization techniques are incompatible with the PET process and may even degrade performance. Consequently, we introduce a new framework named TTE (Two Tokens are Enough), which effectively alleviates overfitting in PET through a novel constraint function based on the learnable tokens. Experiments conducted on 24 datasets spanning image classification and few-shot classification tasks demonstrate that our fine-tuning framework not only mitigates overfitting but also significantly enhances PET's performance. Notably, our TTE framework surpasses the highest-performing FPT framework (DR-Tune) while using significantly fewer parameters (0.15M vs. 85.84M) and achieving an improvement of 1%.
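For readers unfamiliar with the general setup, the following is a minimal, illustrative sketch of a VPT-style PET setup: a frozen transformer backbone adapted by prepending a small number of learnable prompt tokens, with an added penalty on those tokens. The backbone stand-in, the classification head, and the `token_constraint` function are all assumptions for illustration only and are not the TTE constraint defined in the paper.

```python
import torch
import torch.nn as nn

class PromptTunedClassifier(nn.Module):
    """VPT-style sketch: only the prompt tokens and the head are trainable."""
    def __init__(self, dim=192, num_classes=10, num_prompts=2):
        super().__init__()
        # Stand-in for a frozen pre-trained encoder (e.g., a ViT backbone).
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        for p in self.backbone.parameters():
            p.requires_grad = False  # PET: the backbone stays frozen.

        # The only learnable parameters: two prompt tokens and a linear head.
        self.prompts = nn.Parameter(torch.zeros(1, num_prompts, dim))
        nn.init.trunc_normal_(self.prompts, std=0.02)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, patch_tokens):
        # patch_tokens: (batch, seq_len, dim), e.g., pre-computed patch embeddings.
        prompts = self.prompts.expand(patch_tokens.size(0), -1, -1)
        x = torch.cat([prompts, patch_tokens], dim=1)
        x = self.backbone(x)
        return self.head(x.mean(dim=1))


def token_constraint(prompts):
    # Hypothetical regularizer on the learnable tokens (a simple L2 penalty here);
    # the actual TTE constraint function is the one proposed in the paper.
    return prompts.pow(2).mean()


model = PromptTunedClassifier()
x = torch.randn(8, 196, 192)       # dummy patch embeddings
y = torch.randint(0, 10, (8,))     # dummy labels
logits = model(x)
loss = nn.functional.cross_entropy(logits, y) + 0.1 * token_constraint(model.prompts)
loss.backward()                    # gradients flow only into the prompts and the head
```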

Publication
The Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25)
阮佳程
Ph.D. student (enrolled 2022)
谢铭烨
Ph.D. student (enrolled 2019)


高景盛
Ph.D. student (enrolled 2020)
高显
Ph.D. student (enrolled 2023)
向孙程
Ph.D. student (enrolled 2017)
刘婷
Lecturer
付宇卓
Professor and Doctoral Supervisor