On Identifying Phrases Using Collection Statistics.

Simon Gog, Alistair Moffat, Matthias Petri

ECIR (2015)

Abstract

The use of phrases as part of similarity computations can enhance search effectiveness. But the gain comes at a cost, either in terms of index size, if all word-tuples are treated as queryable objects; or in terms of processing time, if postings lists for phrases are constructed at query time. There is also a lack of clarity as to which phrases are “interesting”, in the sense of capturing useful information. Here we explore several techniques for recognizing phrases using statistics of large-scale collections, and evaluate their quality.

Keywords

Mutual Information, Query Time, Methyl Ether Tertiary Butyl, Stop Word, Inverted Index
