Modeling Document Lengths

Simulating Information Retrieval Test Collections, Synthesis Lectures on Information Concepts, Retrieval, and Services (2020)

Abstract
In some applications, very approximate modeling of the distribution of document lengths will suffice. However, there are many scenarios in which accurate modeling is desirable. For example, significant gains in retrieval effectiveness were achieved at TREC-3 through better normalization of document length [74, 81]. Those effects could not have been observed on a simulated collection in which all documents had the same length. Furthermore, distributions of term co-occurrences, distributions of TFs, and topic mixtures are all affected by the distribution of document lengths.
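The point about uniform-length collections can be illustrated with a minimal sketch. It assumes a BM25-style length normalization factor, 1 - b + b * (dl / avgdl), chosen here purely for illustration; the abstract does not say which normalization schemes [74, 81] used. The function names, the value b = 0.75, and the example lengths are all hypothetical.

```python
# Sketch only: a BM25-style length normalization factor, assumed for illustration.
# It is not claimed to be the normalization used in the cited TREC-3 work.

def length_norm(doc_len, avg_len, b=0.75):
    """Return the normalization factor 1 - b + b * (doc_len / avg_len)."""
    return 1.0 - b + b * (doc_len / avg_len)

def norm_factors(doc_lens, b=0.75):
    """Compute the normalization factor for every document in a collection."""
    avg_len = sum(doc_lens) / len(doc_lens)
    return [length_norm(n, avg_len, b) for n in doc_lens]

# Hypothetical collections with the same mean length (300 terms).
uniform = [300, 300, 300, 300]   # every document has the same length
varied = [50, 150, 300, 700]     # lengths differ widely

uniform_factors = norm_factors(uniform)
varied_factors = norm_factors(varied)

print("uniform:", uniform_factors)
print("varied: ", varied_factors)
```

In the uniform collection every document receives the same factor, so changing the normalization scheme cannot change the ranking. Only the varied collection gives the normalization something to act on.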
Keywords
modeling