Query-Driven Indexing in Large-Scale Distributed Systems: Efficient query processing with distributed indexes (2009)

Abstract
Efficient and effective search in large-scale data repositories requires complex indexing solutions deployed on a large number of servers. Commercial Web search engines already rely on complex systems to return relevant query results while keeping processing times within the comfortable sub-second limit. Nevertheless, the exponential growth of Web content poses serious scalability challenges. Coping with these challenges requires novel indexing solutions that not only remain scalable but also preserve search accuracy. In this work we introduce and explore the concept of query-driven indexing, an index construction strategy that uses caching techniques to adapt to the querying patterns expressed by users. We propose abandoning the strict distinction between indexing and caching and building a distributed indexing structure, or a distributed cache, that is optimized for the current query load. Our experimental and theoretical analysis shows that query-driven indexing is especially beneficial when the content is (geographically) distributed in a Peer-to-Peer network.
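The idea of treating the index as a cache tuned to the query load can be illustrated with a small sketch. It is not the method from the work itself: the class name `QueryDrivenIndex`, the `activation_threshold` and `capacity` parameters, the popularity-based eviction rule, and the brute-force fallback scan are all assumptions made for illustration, and everything runs in a single process rather than across peers.

```python
from collections import Counter


class QueryDrivenIndex:
    """Toy index that stores results only for queries that prove popular.

    All names and policies here are hypothetical, chosen for illustration.
    """

    def __init__(self, documents, activation_threshold=2, capacity=2):
        self.documents = documents                  # doc_id -> text
        self.activation_threshold = activation_threshold
        self.capacity = capacity                    # max number of indexed keys
        self.query_counts = Counter()               # observed query popularity
        self.index = {}                             # term set -> sorted doc ids

    @staticmethod
    def _key(query):
        # A query is identified by its set of lowercase terms.
        return frozenset(query.lower().split())

    def _scan(self, key):
        # Fallback for non-indexed queries: check every document.
        return sorted(
            doc_id
            for doc_id, text in self.documents.items()
            if key <= set(text.lower().split())
        )

    def _admit(self, key, result):
        # Assumed policy: when full, evict the least popular key,
        # but only if the newcomer is strictly more popular.
        if len(self.index) >= self.capacity:
            coldest = min(self.index, key=lambda k: self.query_counts[k])
            if self.query_counts[coldest] >= self.query_counts[key]:
                return
            del self.index[coldest]
        self.index[key] = result

    def search(self, query):
        key = self._key(query)
        self.query_counts[key] += 1
        if key in self.index:
            return self.index[key], "index"
        result = self._scan(key)
        if self.query_counts[key] >= self.activation_threshold:
            self._admit(key, result)
        return result, "scan"


docs = {
    1: "distributed index for web search",
    2: "peer to peer web search",
    3: "cache replacement policy",
}
idx = QueryDrivenIndex(docs, activation_threshold=2, capacity=1)
first = idx.search("web search")    # answered by scanning
second = idx.search("web search")   # scanned again, then admitted to the index
third = idx.search("web search")    # answered from the index
```

In this sketch a query is scanned until it has been seen `activation_threshold` times, after which its result list is kept in the index until a more popular query displaces it. A distributed variant would additionally have to decide which peer holds each key, which the sketch leaves out.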
Keywords
query-driven indexing, effective search, current query load, search accuracy, novel indexing solution, efficient query processing, complex system, complex indexing solution, relevant query result, indexing structure, Commercial Web search engine