PILLOW: Enhancing Efficient Instruction Fine-tuning via Prompt Matching.
Conference on Empirical Methods in Natural Language Processing(2023)
摘要
Instruction fine-tuning has conventionally been employed to adapt Large
Language Models (LLMs) to a variety of tasks. Nonetheless, this technique often
necessitates substantial computational resources, making it impractical for
deployment by individuals or small-scale entities. Recently, Low-Rank
Adaptation (LoRA) has become a promising alternative, offering high
capabilities on par with full tuning with reduced resource overhead. However,
attaining satisfactory performance through the fine-tuning of LoRA is a
non-trivial challenge. In this paper, we propose PILLOW, which aims to improve
LoRA's performance by a discrimination-based prompting method, leveraging LLMs'
In-Context Learning ability. PILLOW incorporates a matching network that
selects prompts from a user-defined prompt pool, concatenates the selected
prompts with the user instruction as input, and performs inference using the
LoRA-fine-tuned LLMs. Trained with Reinforcement Learning, PILLOW exhibits
commensurate performance on various evaluation metrics compared with typical
instruction fine-tuning methods, utilizing only consumer-grade GPU resources
and exhibiting a large reduction in computational costs.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要