MFBE: Leveraging Multi-Field Information of FAQs for Efficient Dense Retrieval
arXiv (Cornell University) (2023)
Abstract
In the domain of question answering in NLP, the retrieval of Frequently Asked
Questions (FAQs) is an important, well-researched sub-area that has been
studied for many languages. Here, in response to a user query, a retrieval
system typically returns the relevant FAQs from a knowledge base. The efficacy
of such a system depends on its ability to establish a semantic match between the
query and the FAQs in real time. The task is challenging because of the
inherent lexical gap between queries and FAQs, the lack of sufficient context in
FAQ titles, the scarcity of labeled data, and high retrieval latency. In this work,
we propose a bi-encoder-based query-FAQ matching model that leverages multiple
combinations of FAQ fields (such as question, answer, and category) both during
model training and inference. Our proposed Multi-Field Bi-Encoder (MFBE) model
benefits from the additional context resulting from multiple FAQ fields and
performs well even with minimal labeled data. We empirically support this claim
through experiments on proprietary as well as open-source public datasets in
both unsupervised and supervised settings. Our model achieves gains of around
27% and 20% on the proprietary and public datasets, respectively, over the
best-performing baseline.
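
To make the idea of matching a query against several combinations of FAQ fields concrete, here is a minimal sketch. It is not the MFBE model: the `encode` function is a toy bag-of-words stand-in for a learned bi-encoder, and the particular field combinations, the max-over-combinations scoring rule, and all names below are assumptions chosen for illustration. As in a bi-encoder setup, the FAQ vectors are computed once offline, so only the query is encoded at retrieval time.

```python
import math
from collections import Counter

def encode(text):
    """Toy stand-in for a learned encoder: L2-normalised bag-of-words counts."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}

def cosine(u, v):
    """Dot product of two L2-normalised sparse vectors."""
    return sum(w * v.get(tok, 0.0) for tok, w in u.items())

# Hypothetical field combinations; the source only says that multiple
# combinations of question, answer, and category are used.
FIELD_COMBINATIONS = [("question",), ("question", "answer"), ("question", "category")]

def build_index(faqs):
    """Encode every field combination of every FAQ once, ahead of query time."""
    return [
        [encode(" ".join(faq[f] for f in combo)) for combo in FIELD_COMBINATIONS]
        for faq in faqs
    ]

def retrieve(query, faqs, index, k=1):
    """Rank FAQs by their best-scoring field combination (an assumed rule)."""
    q = encode(query)
    scored = [(max(cosine(q, v) for v in vecs), i) for i, vecs in enumerate(index)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [faqs[i]["question"] for _, i in scored[:k]]

faqs = [
    {"question": "How do I reset my password?",
     "answer": "Open settings and choose reset password to receive an email link.",
     "category": "account"},
    {"question": "Where is my order?",
     "answer": "Track the shipment from the orders page.",
     "category": "delivery"},
]
index = build_index(faqs)
top = retrieve("track shipment", faqs, index)
```

In this toy example the query "track shipment" shares no words with either FAQ question, so a question-only match scores zero for both. The question-plus-answer combination supplies the missing context and ranks "Where is my order?" first, which illustrates the lexical-gap and short-title problems the abstract describes.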
Keywords
FAQs, efficient dense, multi-field