DoRA: Weight-Decomposed Low-Rank Adaptation
CoRR (2024)
Abstract
Among the widely used parameter-efficient fine-tuning (PEFT) methods, LoRA and
its variants have gained considerable popularity because they avoid additional
inference costs. However, an accuracy gap often remains between
these methods and full fine-tuning (FT). In this work, we first introduce a
novel weight decomposition analysis to investigate the inherent differences
between FT and LoRA. Guided by these findings and aiming to match the learning
capacity of FT, we propose Weight-Decomposed Low-Rank Adaptation (DoRA). DoRA
decomposes the pre-trained weight into two components, magnitude and direction,
for fine-tuning, specifically employing LoRA for directional updates to
efficiently minimize the number of trainable parameters. By employing DoRA, we
enhance both the learning capacity and training stability of LoRA while
avoiding any additional inference overhead. DoRA consistently outperforms LoRA
on fine-tuning LLaMA, LLaVA, and VL-BART on various downstream tasks, such as
commonsense reasoning, visual instruction tuning, and image/video-text
understanding.
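
The magnitude/direction decomposition described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes the magnitude is the per-column norm of the weight matrix and that the LoRA factors `A` and `B` follow the usual convention with `B` initialized to zero. The function name `dora_weight` and the toy shapes are hypothetical.

```python
import numpy as np

def dora_weight(W0, A, B, m):
    """Recompose a weight from a magnitude vector and a LoRA-updated direction.

    W0: frozen pre-trained weight, shape (d, k)
    A, B: low-rank factors, shapes (r, k) and (d, r); B @ A is the directional update
    m: magnitude, shape (1, k) (assumed: one scalar per column)
    """
    V = W0 + B @ A                                        # unnormalized direction
    col_norm = np.linalg.norm(V, axis=0, keepdims=True)   # per-column norm, shape (1, k)
    return m * (V / col_norm)                             # rescale unit columns by m

rng = np.random.default_rng(0)
d, k, r = 6, 4, 2                                  # toy sizes, chosen for illustration
W0 = rng.standard_normal((d, k))
A = rng.standard_normal((r, k)) * 0.01
B = np.zeros((d, r))                               # zero init: the update starts at zero
m = np.linalg.norm(W0, axis=0, keepdims=True)      # magnitude initialized from W0

# With B = 0 the recomposed weight equals the pre-trained weight.
W_init = dora_weight(W0, A, B, m)

# Only m, A, and B would be trained; W0 stays frozen.
trainable = m.size + A.size + B.size
```

Because the recomposed weight is an ordinary matrix of the same shape as `W0`, it can be merged once after training, which is consistent with the abstract's claim of no additional inference overhead.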