A Category-integrated Language Model for Question Retrieval in Community Question Answering.

AIRS (2012)

Abstract
Community Question Answering (CQA) services have accumulated large archives of question-answer pairs, which are usually organized into a hierarchy of categories. To reuse these valuable resources, it is essential to develop effective Question Retrieval (QR) models that retrieve similar questions from CQA archives for a given query question. This paper studies how to integrate the category information of questions into the unigram Language Model (LM). Specifically, it proposes a novel Category-integrated Language Model (CLM), which treats category-specific term saliency as the Dirichlet hyper-parameter that weights the parameters of the LM. A point-wise divergence-based measure is introduced to compute a term's category-specific saliency. Experiments on a real-world dataset from Yahoo! Answers show that the proposed CLM, which integrates category information into the LM internally at the word level, significantly outperforms previous work that incorporates category information into the LM externally, at either the word level or the document level. © Springer-Verlag 2012.
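The abstract does not give the model's formulas, so the following is only a minimal sketch of the general idea, not the paper's actual method. It assumes that saliency is the clipped pointwise KL contribution P(w|category) · log(P(w|category) / P(w|collection)), and that the Dirichlet pseudo-count of a term is mu · (base + saliency) · P(w|collection). The function names and the `mu` and `base` parameters are hypothetical.

```python
import math
from collections import Counter


def pointwise_kl_saliency(term, cat_counts, coll_counts):
    """Assumed saliency: P(w|cat) * log(P(w|cat) / P(w|coll)), clipped at 0."""
    p_cat = cat_counts.get(term, 0) / sum(cat_counts.values())
    p_coll = coll_counts.get(term, 0) / sum(coll_counts.values())
    if p_cat == 0.0 or p_coll == 0.0:
        return 0.0
    return max(0.0, p_cat * math.log(p_cat / p_coll))


def clm_log_likelihood(query, doc, cat_counts, coll_counts, mu=100.0, base=1.0):
    """Query log-likelihood under a unigram LM with term-specific Dirichlet priors.

    Assumption (not from the paper): the pseudo-count of term w is
    alpha_w = mu * (base + saliency(w, category)) * P(w | collection).
    """
    coll_total = sum(coll_counts.values())
    alpha = {
        w: mu * (base + pointwise_kl_saliency(w, cat_counts, coll_counts)) * c / coll_total
        for w, c in coll_counts.items()
    }
    doc_counts = Counter(doc)
    denom = len(doc) + sum(alpha.values())
    score = 0.0
    for w in query:
        if w not in alpha:
            continue  # skip query terms unseen in the collection
        score += math.log((doc_counts.get(w, 0) + alpha[w]) / denom)
    return score


# Toy counts, invented purely for illustration.
collection = Counter({"the": 50, "best": 10, "hotel": 5, "python": 5})
travel = Counter({"the": 10, "best": 2, "hotel": 5})

query = ["best", "hotel"]
score_match = clm_log_likelihood(query, ["the", "best", "hotel"], travel, collection)
score_other = clm_log_likelihood(query, ["the", "python"], travel, collection)
```

In this sketch the category enters the model through the prior of each individual term, which is one way to read "internally at the word level"; an external combination would instead interpolate a separately computed category score with the LM score.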
Keywords
category, category-integrated language model, community question answering, question retrieval