TFCSG: An Unsupervised Approach for Question-retrieval Over Multi-task Learning

2023 62nd Annual Conference of the Society of Instrument and Control Engineers (SICE), 2023

Abstract
Most question-answering (QA) systems rely on training data to reach their optimal performance. However, acquiring training data for supervised systems is both time-consuming and resource-intensive. To address this, we propose TFCSG, an unsupervised similar-question retrieval approach that leverages pre-trained language models and multi-task learning. First, topic keywords are extracted sequentially from question sentences using a latent topic-filtering algorithm to construct an unsupervised training corpus. Then, a multi-task learning method is used to build the question retrieval model. Three tasks are designed: the first is a short-sentence contrastive learning task; the second judges the similarity between a question sentence and its corresponding topic sequence; the third generates the corresponding topic sequence from a question sentence. The three tasks are used to train the language model in parallel. Finally, similar questions are retrieved by computing the cosine similarity between sentence vectors. Comparative experiments on public question datasets show that TFCSG outperforms the unsupervised baseline methods. Moreover, no manual annotation is required, which greatly saves human resources.
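The final retrieval step, ranking candidate questions by the cosine similarity of their sentence vectors, can be sketched as below. The encoder that produces the vectors is the paper's trained model and is assumed here; `retrieve_similar` and the toy vectors are hypothetical names for illustration only.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve_similar(query_vec, corpus_vecs, top_k=3):
    """Return indices of the top_k corpus vectors most similar to the query."""
    scores = [cosine_similarity(query_vec, v) for v in corpus_vecs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]

# Toy example: the second corpus vector is identical to the query,
# so it is ranked first.
query = np.array([1.0, 2.0, 3.0])
corpus = [np.array([0.0, 1.0, 0.0]),
          np.array([1.0, 2.0, 3.0]),
          np.array([1.0, 0.0, 0.0])]
print(retrieve_similar(query, corpus, top_k=1))  # → [1]
```

In practice the sentence vectors would come from the multi-task-trained language model; the ranking logic itself is independent of how the vectors are produced.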
Keywords
question-retrieval,multi-task learning,topic model,contrastive learning,transfer learning,sentence representation