Improving text retrieval for the routing problem using latent semantic indexing

SIGIR, pp.282-291, (1994)


Abstract

Latent Semantic Indexing (LSI) is a novel approach to information retrieval that attempts to model the underlying structure of term associations by transforming the traditional representation of documents as vectors of weighted term frequencies to a new coordinate space where both documents and terms are represented as linear combinations...

Introduction
  • The vector space model (VSM) [1], which measures the similarity between the query and each document by the weighted inner product of overlapping terms, has long been a standard in information retrieval.

    The VSM has known flaws, since it ignores both the order of terms and the associations between them, but it is hard to find a better method of equivalent computational complexity.
  • LSI reduces the full term-document matrix to a small number of information-rich LSI vectors, which can be used in a traditional retrieval model or as the basis for more advanced statistical classification algorithms.
  • The goal is to find the relevant documents in a new collection, or the remaining relevant documents in the collection from which the sample is drawn.
  • This task is equivalent to the routing problem used for system evaluation at the TREC retrieval conference [3].
  • One can imagine this task as the second stage in a retrieval algorithm, in place of the traditional strategy of relevance feedback [4].
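The weighted inner product described above can be sketched in a few lines. This is a minimal illustration, not the weighting scheme used in the paper: raw term frequency stands in for the term weights, and the names `term_weights`, `vsm_score`, and the toy documents are assumptions made for the example.

```python
from collections import Counter


def term_weights(text):
    """Map each term to a weight; raw term frequency is an assumed stand-in."""
    return Counter(text.lower().split())


def vsm_score(query_weights, doc_weights):
    """Weighted inner product over the terms shared by query and document."""
    shared = set(query_weights) & set(doc_weights)
    return sum(query_weights[t] * doc_weights[t] for t in shared)


# Hypothetical toy collection.
docs = {
    "d1": "latent semantic indexing for text retrieval",
    "d2": "statistical classification of documents",
    "d3": "text retrieval with the vector space model",
}
query = term_weights("text retrieval model")

# Rank documents by decreasing similarity to the query.
ranking = sorted(
    docs,
    key=lambda d: vsm_score(query, term_weights(docs[d])),
    reverse=True,
)
print(ranking)  # ['d3', 'd1', 'd2']
```

Only overlapping terms contribute to the score, so a document that uses a synonym of a query term gets no credit for it. That is the term-independence limitation the summary attributes to the VSM.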
Highlights
  • The vector space model (VSM) [1], which measures the similarity between the query and each document by the weighted inner product of overlapping terms, has long been a standard in information retrieval.
  • We address the issue of whether Latent Semantic Indexing improves performance when applied to the routing task.
  • We examine an alternative application of Latent Semantic Indexing that can be used in conjunction with statistical classification to obtain a significant improvement in retrieval performance.
  • The vector space model has long been used as a basic framework for developing new retrieval methods.
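To make the pairing of LSI vectors with statistical classification concrete, here is a rough sketch of routing as a two-class problem. The summary does not say which classifier the paper uses, so a simple difference-of-class-means rule is swapped in as a stand-in. The function names and the 2-dimensional "LSI vectors" are hypothetical values chosen for illustration.

```python
import numpy as np


def fit_mean_difference(X, y):
    """Return direction w and offset b separating relevant (1) from non-relevant (0) rows."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mu_rel = X[y == 1].mean(axis=0)
    mu_non = X[y == 0].mean(axis=0)
    w = mu_rel - mu_non
    b = w @ (mu_rel + mu_non) / 2.0  # projection of the midpoint between class means
    return w, b


def routing_scores(X_new, w, b):
    """Positive score: the document lies closer to the relevant-class mean."""
    return np.asarray(X_new, dtype=float) @ w - b


# Hypothetical low-dimensional vectors for judged training documents.
X_train = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
y_train = [1, 1, 0, 0]
w, b = fit_mean_difference(X_train, y_train)

# Score and rank unseen documents, best first.
X_new = [[0.7, 0.3], [0.2, 0.8]]
scores = routing_scores(X_new, w, b)
ranked = np.argsort(-scores)
```

Working in a small number of dimensions is what makes this kind of classifier practical: class statistics estimated from a modest sample of judged documents are far more stable in a few dozen coordinates than in the full term space.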
Results
  • If a retrieval strategy does not improve performance for the routing task, it will not produce good results for query-based information retrieval.
  • The authors' experiments provide evidence that LSI slightly improves performance for the routing task.
  • The authors address the issue of whether LSI improves performance when applied to the routing task.
Conclusion
  • The vector space model has long been used as a basic framework for developing new retrieval methods.
  • It is difficult to devise a retrieval strategy that performs better with an equivalent amount of computation.
  • The vector space model has some significant problems.
  • It assumes that terms are independent and ignores term associations.
  • Latent Semantic Indexing addresses this problem by re-expressing the term-document matrix in a new coordinate system designed to capture the most significant components of the term association structure.
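The re-expression of the term-document matrix can be illustrated with a truncated singular value decomposition, the standard construction behind LSI. The sketch below is a toy example under stated assumptions: the matrix, the choice of `k=2`, and the helper names `lsi_fit` and `lsi_project` are all hypothetical, and no term weighting is applied.

```python
import numpy as np


def lsi_fit(term_doc, k):
    """Truncated SVD of a terms x documents matrix, keeping the k largest singular values."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def lsi_project(U_k, term_vector):
    """Express a term-space vector (document or query) in the k LSI coordinates."""
    return U_k.T @ term_vector


# Toy matrix: rows are terms, columns are documents.
A = np.array([
    [1, 0, 1, 0],  # car
    [0, 1, 1, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 0, 1],  # flower
    [0, 0, 0, 1],  # petal
], dtype=float)

U_k, s_k, Vt_k = lsi_fit(A, k=2)
doc_vectors = (U_k.T @ A).T  # one row of LSI coordinates per document

# Query containing only the term "car".
q_terms = np.array([1, 0, 0, 0, 0], dtype=float)
q_lsi = lsi_project(U_k, q_terms)

term_space_sim = A[:, 1] @ q_terms     # document 1 never mentions "car"
lsi_space_sim = doc_vectors[1] @ q_lsi  # but it shares the vehicle-related dimension
```

In term space, document 1 ("automobile", "engine") has zero overlap with the query "car". In the reduced space the two share a direction built from co-occurring terms, so the similarity becomes positive, while the unrelated "flower" document stays at zero.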
References
  • 1. Gerard Salton, editor. The SMART ing. Prentice-Hall, 1971.
  • 2. S. Deerwester, S. Dumais, G. Furnas, T. Landauer, and R. Harshman. Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6):391–407, 1990.
  • 3. Donna Harman. Overview ence, pages 36–47, 1993.
  • 4. Gerard Salton and Christopher Improving retrieval performance by relevance Journal of the American Society for Information Science, 41(4):288–297, 1990.
  • 5. Yonggang Qiu and H.P. Frei. Concept Conference, pages 160–169, 1993.
  • 6. Hinrich 1992. Dimensions of meaning. In Proceedings of Supercomputing
  • Conference, pages 107–115, 1993.
  • 8. J. Friedman, J. Bentley, and R. Finkel. An algorithm for finding best matches in logarithmic time. ACM Transactions on Mathematical Software, 3(3):209–226, 1977.
  • 9. G. Furnas, S. Deerwester, S. Dumais, T. Landauer, R. Harshman, Information retrieval using a singular value decomposition model Proc. of the 11th ACM/SIGIR Conference, pages 465–480, 1988.
  • 10. B.T. Bartell, G.W. Cottrell, and R.K. Belew. Latent semantic indexing is an optimal special case of multidimensional scaling. In Proc. of the 15th ACM/SIGIR Conference, pages 161–167, 1992.
  • Term-weighting approaches Processing and Management, 24(5):513–523, 1988.
  • 13. M. Berry. Large scale singular cations, 6(1):13–49, 1992. International Journal of Supercomputer
  • 14. David Hull. Using statistical testing in the evaluation Conference, pages 329–338, 1993.
  • 15. Donna Harman. 1–10, 1993.
  • 16. Geoffrey 341–346. J. McLachlan. Wiley, 1992.
  • Conference, pages 202–210, 1991.
  • Conference, pages 18–25, 1985.