On Identifying Phrases Using Collection Statistics.

ECIR (2015)

Abstract
The use of phrases as part of similarity computations can enhance search effectiveness. But the gain comes at a cost, either in terms of index size, if all word-tuples are treated as queryable objects; or in terms of processing time, if postings lists for phrases are constructed at query time. There is also a lack of clarity as to which phrases are “interesting”, in the sense of capturing useful information. Here we explore several techniques for recognizing phrases using statistics of large-scale collections, and evaluate their quality.
Keywords
Mutual Information, Query Time, Methyl Ether Tertiary Butyl, Stop Word, Inverted Index
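
As a rough illustration of what "recognizing phrases using collection statistics" can look like, the sketch below scores adjacent word pairs by pointwise mutual information (PMI). This is an assumption made for illustration: the abstract does not say which statistics the paper evaluates, and mutual information appears only in the keyword list. The function name `bigram_pmi` and the `min_count` threshold are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def bigram_pmi(documents, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    documents: iterable of token lists (one list per document).
    Returns a dict mapping (w1, w2) -> PMI in bits, restricted to pairs
    that occur at least min_count times in the collection.

    Illustrative only: this is not the paper's method, just one common
    collection-statistics heuristic for spotting candidate phrases.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tokens in documents:
        unigrams.update(tokens)
        # Adjacent pairs within a document; pairs never span documents.
        bigrams.update(zip(tokens, tokens[1:]))

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_pair = count / total_bigrams
        p_w1 = unigrams[w1] / total_unigrams
        p_w2 = unigrams[w2] / total_unigrams
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


docs = [
    ["inverted", "index", "size"],
    ["the", "inverted", "index"],
    ["the", "size", "of", "the", "index"],
]
scores = bigram_pmi(docs)
```

A high positive score means the two words co-occur more often than their individual frequencies would predict, which is one possible signal that the pair is worth treating as a phrase. The frequency threshold matters because PMI tends to overrate pairs built from rare words.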