SEED: Customize Large Language Models with Sample-Efficient Adaptation for Code Generation
arXiv (2024)
Abstract
Although Large Language Models (LLMs) have made significant progress in code
generation, they still struggle with code generation tasks in specific
scenarios. These scenarios usually necessitate the adaptation of LLMs to
fulfill specific needs, but the limited training data available in practice
leads to poor code generation performance. Effectively adapting LLMs to new
scenarios with few training samples remains a major challenge for current code
generation. In this paper, we propose a novel adaptation approach named SEED,
which stands for Sample-Efficient adaptation with Error-Driven learning for
code generation. SEED leverages the errors made by LLMs as learning
opportunities, using error revision to overcome their own shortcomings and thus
achieve efficient learning. Specifically, SEED identifies erroneous code
generated by LLMs, employs Self-revise for code revision, optimizes the
model with the revised code, and iterates this process for continuous
improvement (a minimal sketch follows the abstract). Experimental results show
that, compared to traditional
fine-tuning approaches, SEED achieves superior performance with fewer training
samples, showing a relative improvement of 27.2%. We further
validate the effectiveness of Self-revise, which generates revised code that
optimizes the model more efficiently than the original code samples from the
datasets. Moreover, SEED consistently demonstrates strong performance across
various LLMs, underscoring its generalizability.