The Hilbert PDC-tree: A High-Velocity Structure for Many-Dimensional Data.

IDEAS(2016)

引用 2|浏览26
暂无评分
摘要
ABSTRACTFast aggregation of data with many dimensions is a key component of many applications. The R-tree is the traditional data structure for indexing multi-dimensional data, but even the best R-tree variants suffer from performance degradation as the number of dimensions increases. The DC-tree addressed this issue by replacing Minimum Bounding Rectangle (MBR) keys with Minimum Describing Subsets (MDSs), which are less susceptible to overlap. This technique dramatically improves query performance with many dimensions, but at the cost of reduced insertion performance. Like most R-tree variants, this insertion overhead comes from expensive geometric comparisons while selecting the best child for insertion, or splitting over-full nodes. DC-trees, including the parallel PDC-tree, suffer even more from this overhead since MDSs are typically much more expensive to compare and manipulate than MBRs. This paper introduces the Hilbert PDC-tree, a parallel index structure for many-dimensional data that supports high-velocity data ingestion. This is achieved by avoiding geometric comparisons during insertion by instead inserting records based on the Hilbert index of their keys. This approach is similar to that of the Hilbert R-tree, but with special considerations for efficiently supporting many hierarchical dimensions. Additionally, a new node splitting algorithm significantly reduces overlap and improves query performance. Experiments show that the Hilbert PDC-tree scales well to a high number of dimensions, while supporting a much higher rate of ingestion and better query performance than the PDC-tree.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要