MinPrompt: Graph-based Minimal Prompt Data Augmentation for Few-shot Question Answering
CoRR (2023)
Abstract
Recent advances in few-shot question answering (QA) mostly rely on the power
of pre-trained large language models (LLMs) and fine-tuning in specific
settings. Although the pre-training stage has already equipped LLMs with
powerful reasoning capabilities, LLMs still need to be fine-tuned to adapt to
specific domains to achieve the best results. In this paper, we propose to
select the most informative data for fine-tuning, thereby improving the
efficiency of the fine-tuning process with comparable or even better accuracy
on the open-domain QA task. We present MinPrompt, a minimal data augmentation
framework for open-domain QA based on an approximate graph algorithm and
unsupervised question generation. We transform the raw text into a graph
structure to build connections between different factual sentences, then apply
graph algorithms to identify the minimal set of sentences needed to cover the
most information in the raw text. We then generate QA pairs based on the
identified sentence subset and train the model on the selected sentences to
obtain the final model. Empirical results on several benchmark datasets and
theoretical analysis show that MinPrompt is able to achieve comparable or
better results than baselines with a high degree of efficiency, bringing
consistent improvements in F-1 scores.
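The sentence-selection step described above can be sketched as a greedy approximation of a minimal dominating set over the sentence graph. The snippet below is an illustrative assumption of how such a step might look, not the paper's exact algorithm: the entity-overlap edges, the function name, and the toy graph are all hypothetical.

```python
# Hypothetical sketch: sentences are graph nodes, edges link sentences that
# share a fact or entity, and a greedy minimal-dominating-set approximation
# picks the smallest subset that "covers" every sentence in the raw text.

def greedy_dominating_set(adjacency: dict[str, set[str]]) -> set[str]:
    """Greedily pick the node whose closed neighbourhood (itself plus its
    neighbours) covers the most still-uncovered nodes, until all are covered."""
    uncovered = set(adjacency)
    chosen: set[str] = set()
    while uncovered:
        best = max(adjacency,
                   key=lambda n: len(({n} | adjacency[n]) & uncovered))
        chosen.add(best)
        uncovered -= {best} | adjacency[best]
    return chosen

# Toy sentence graph: an edge means two factual sentences mention a shared entity.
graph = {
    "s1": {"s2", "s3"},
    "s2": {"s1"},
    "s3": {"s1", "s4"},
    "s4": {"s3"},
    "s5": set(),  # an isolated sentence must be selected itself
}
print(greedy_dominating_set(graph))
```

QA pairs would then be generated only from the selected sentences, which is what keeps the fine-tuning set small. The greedy rule gives the standard logarithmic approximation guarantee for set cover / dominating set, which matches the abstract's claim of an "approximate graph algorithm".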
Keywords
minimal data augmentation, graph-based, few-shot question answering