
InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct

Yutong Wu, Di Huang, Wenxuan Shi, Wei Wang, Lingzhe Gao, Shihao Liu, Ziyuan Nan, Kaizhao Yuan, Rui Zhang, Xishan Zhang, Zidong Du, Qi Guo, Yewen Pu, Dawei Yin, Xing Hu, Yunji Chen

CoRR (2024)

Abstract
Recent advancements in open-source code large language models (LLMs) have been driven by fine-tuning on data generated by powerful closed-source LLMs, which is expensive to obtain. This paper explores whether a fine-tuned open-source model can generate additional data to augment its own instruction-tuning dataset. We make two observations: (1) a code snippet can serve as the response to different instructions, and (2) instruction-tuned code LLMs perform better at translating code into instructions than the reverse. Based on these observations, we propose Inverse-Instruct, a data augmentation technique that uses a fine-tuned LLM to generate additional instructions for the code responses in its own training dataset. The additional instruction-response pairs are added to the original dataset, and a stronger code LLM is obtained by fine-tuning on the augmented dataset. We empirically validate Inverse-Instruct on a range of open-source code models (e.g., CodeLlama-Python and DeepSeek-Coder) and benchmarks (e.g., HumanEval(+), MBPP(+), DS-1000, and MultiPL-E), showing that it consistently improves the base models.
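
The augmentation step described above can be summarized as a simple loop over the existing instruction-tuning data. The following is a minimal sketch, assuming a generic model_generate(prompt) callable standing in for the fine-tuned code LLM; the function names, prompt wording, and filtering rule are illustrative assumptions, not the authors' exact setup.

```python
# Hedged sketch of the Inverse-Instruct augmentation loop (not the official implementation).
from typing import Callable, Dict, List

# Hypothetical prompt template asking the model to invert a code response into an instruction.
INVERSE_PROMPT = (
    "Write a concise programming instruction that the following code answers.\n\n"
    "Code:\n{code}\n\nInstruction:"
)

def inverse_instruct(
    dataset: List[Dict[str, str]],          # original {"instruction": ..., "response": ...} pairs
    model_generate: Callable[[str], str],   # fine-tuned LLM used for code -> instruction generation
) -> List[Dict[str, str]]:
    """Generate additional instructions for existing code responses and
    return the original dataset augmented with the new pairs."""
    augmented = list(dataset)
    for example in dataset:
        code = example["response"]
        new_instruction = model_generate(INVERSE_PROMPT.format(code=code)).strip()
        # Keep only non-empty instructions that differ from the original one
        # (a stand-in for the paper's data cleaning/selection step).
        if new_instruction and new_instruction != example["instruction"]:
            augmented.append({"instruction": new_instruction, "response": code})
    return augmented
```

The augmented dataset returned by this sketch would then be used to fine-tune the base model again, yielding the stronger code LLM the abstract describes.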

Key points: This paper proposes Inverse-Instruct, a data augmentation technique that improves model performance by using an instruction-tuned open-source code LLM to generate additional instruction-code pairs.

Method: The authors observe that a code snippet can serve as the response to different instructions, and that instruction-tuned code LLMs are better at translating code into instructions than the reverse; based on these observations, they propose the Inverse-Instruct method.

Experiments: Inverse-Instruct is empirically validated on open-source code models (e.g., CodeLlama-Python and DeepSeek-Coder) and benchmarks (e.g., HumanEval(+), MBPP(+), DS-1000, and MultiPL-E), and the results show that it consistently improves the base models.