Conceptual Search And Text Categorization

msra (2008)

Abstract

The most fundamental problem in information retrieval is that of interpreting the information needs of users, typically expressed in a short query. Using the surface-level representation of the query is especially unsatisfactory when the information needs are topic-specific, such as "US politics" or "Space Science", which seem to require understanding what the query means rather than what it is. We suggest that a newly proposed semantic representation of words (4) can be used to support Conceptual Search. Namely, it allows retrieving documents on a given topic even when existing keyword-based search approaches fail. The method we develop allows us to categorize and retrieve documents topically on the fly, without looking at the data collection ahead of time, without knowing a priori the topics of interest, and without training topic categorization classifiers. We compare our approach experimentally to state-of-the-art IR techniques and to machine-learning-based text categorization techniques and demonstrate significant improvement in performance. Moreover, as we show, our method is intrinsically adaptable to new text collections and domains.
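
The idea of categorizing documents without training classifiers can be illustrated with a minimal sketch. It is not the authors' implementation: the `CONCEPTS` table is a hypothetical, hand-written stand-in for the semantic word representation the abstract cites, and `embed`, `cosine`, and `categorize` are names chosen here for illustration. The sketch assumes that both the topic name and the document are mapped into the same concept space and that the document is assigned to the closest topic.

```python
import math
from collections import Counter

# Hypothetical toy word-to-concept weights. A real system would derive these
# from an external knowledge source instead of writing them by hand.
CONCEPTS = {
    "senate":   {"government": 1.0},
    "election": {"government": 0.8},
    "politics": {"government": 1.0},
    "us":       {"government": 0.3},
    "orbit":    {"astronomy": 1.0},
    "rocket":   {"astronomy": 0.7, "engineering": 0.5},
    "space":    {"astronomy": 0.9},
    "science":  {"astronomy": 0.3, "engineering": 0.3},
}

def embed(text):
    """Sum the concept vectors of all known words in the text."""
    vec = Counter()
    for word in text.lower().split():
        for concept, weight in CONCEPTS.get(word, {}).items():
            vec[concept] += weight
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def categorize(document, topic_names):
    """Pick the topic whose name is closest to the document in concept space.

    No labeled documents are used: the topic names themselves are embedded.
    """
    doc_vec = embed(document)
    return max(topic_names, key=lambda t: cosine(doc_vec, embed(t)))

topics = ["US politics", "Space Science"]
print(categorize("the rocket reached orbit", topics))
print(categorize("senate election results", topics))
```

In this toy setup the first document shares no keyword with the label "Space Science", yet it is still matched to it through the shared concepts, which is the kind of case where a purely keyword-based match would find nothing.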

Keywords

semantics,information retrieval,feature representation,machine learning,information need,natural language processing,col
