Query Rewriting in Retrieval-Augmented Large Language Models.

CoRR(2023)

Abstract
Large Language Models (LLMs) serve as a powerful Reader in the Retrieve-then-Read pipeline and have made great progress on knowledge-based open-domain tasks. This work introduces a new framework, Rewrite-Retrieve-Read, that improves the retrieval-augmented method from the perspective of query rewriting. Prior studies mostly adapt the retriever or stimulate the reader. In contrast, our approach focuses on adapting the query itself, because the original query is not always optimal for retrieval on behalf of the LLM, especially in real-world settings. (1) We first prompt an LLM to rewrite the queries and then conduct retrieval-augmented reading. (2) We further apply a small language model as a trainable rewriter, which rewrites the search query to suit the frozen retriever and the LLM reader. To fine-tune the rewriter, we first use pseudo data for supervised warm-up training. The Retrieve-then-Read pipeline is then modeled as a reinforcement learning context, and the rewriter is further trained as a policy model by maximizing a reward based on pipeline performance. Evaluation is performed on two downstream tasks, open-domain QA and multiple choice. Our framework is shown to be effective and scalable.
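The flow described in the abstract (rewrite the query, retrieve with it, read the retrieved documents, then score the final answer) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `toy_rewriter`, `toy_retriever`, `toy_reader`, `rewrite_retrieve_read`, and `exact_match_reward` are hypothetical, the three-sentence corpus stands in for a real search backend, and exact match is only one plausible choice for the "pipeline performance" reward.

```python
from typing import Callable, List

# Toy corpus standing in for a search index (hypothetical data).
CORPUS = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "The Nile is a river in Africa.",
]

# Filler words the toy rewriter drops (an assumption for this sketch).
STOP_WORDS = {"hey", "please", "tell", "me", "what", "is", "the", "of"}


def toy_rewriter(question: str) -> str:
    """Stand-in for the LLM or small trainable rewriter: drops filler words."""
    tokens = [t.strip("?,.!").lower() for t in question.split()]
    return " ".join(t for t in tokens if t and t not in STOP_WORDS)


def toy_retriever(query: str, k: int = 1) -> List[str]:
    """Stand-in for the frozen retriever: ranks documents by word overlap."""
    query_words = set(query.lower().split())

    def overlap(doc: str) -> int:
        doc_words = {w.strip(".").lower() for w in doc.split()}
        return len(query_words & doc_words)

    return sorted(CORPUS, key=overlap, reverse=True)[:k]


def toy_reader(question: str, docs: List[str]) -> str:
    """Stand-in for the frozen LLM reader: returns the first word of the top document."""
    return docs[0].split()[0] if docs else ""


def rewrite_retrieve_read(
    question: str,
    rewriter: Callable[[str], str],
    retriever: Callable[[str], List[str]],
    reader: Callable[[str, List[str]], str],
) -> str:
    """Run the three stages in order: rewrite, retrieve, read."""
    query = rewriter(question)      # step 1: adapt the query
    docs = retriever(query)         # step 2: retrieve with the rewritten query
    return reader(question, docs)   # step 3: answer the original question


def exact_match_reward(prediction: str, gold: str) -> float:
    """Assumed reward signal: 1.0 if the pipeline answer matches the gold answer."""
    return 1.0 if prediction.strip().lower() == gold.strip().lower() else 0.0


if __name__ == "__main__":
    question = "Hey, please tell me what is the capital of Germany?"
    answer = rewrite_retrieve_read(question, toy_rewriter, toy_retriever, toy_reader)
    print(answer, exact_match_reward(answer, "Berlin"))
```

In a reinforcement learning setup of the kind the abstract outlines, only the rewriter would be updated; the retriever and reader stay frozen, and a scalar such as the one returned by `exact_match_reward` would serve as the feedback for the rewriter's policy.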