Thorough Data Pruning for Join Query in Database System

IEEE Transactions on Sustainable Computing (2023)

Abstract
Improving the robustness and efficiency of multi-way equijoin queries is challenging in both centralized and distributed database systems. Large amounts of unnecessary data carried through query processing seriously degrade both metrics. If unnecessary data can be thoroughly pruned in advance, robustness and efficiency improve substantially. However, the pruning power of current strategies, such as predicate push-down and algebraic equivalence, is limited. We present deepDP, a powerful, generalized, and efficient strategy for data pruning. deepDP builds multiple independent pruning spaces by generating the longest transitive closures and applies an appropriate data pruning strategy to each pruning space. To prune unnecessary data thoroughly, deepDP employs an $\alpha \cdot \beta$ pruning strategy that cleans each pruning space based on a newly designed statistic, Hollow Range, and re-shuffles the elements in all pruned spaces to maximize the robustness and efficiency benefits while minimizing invasiveness. We implement deepDP in PostgreSQL, although the approach is not limited to it, and evaluate it on TPC-H, JOB, and our synthetic benchmark, DHR. The experimental results show that, compared to traditional data pruning strategies, deepDP improves the efficiency of multi-way equijoin queries by 3.5x.
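The idea of grouping join columns through transitive closure and pruning within each group can be illustrated with a small sketch. This is not deepDP itself: the abstract does not define the $\alpha \cdot \beta$ strategy or the Hollow Range statistic, so the sketch swaps in a plain value-intersection (semi-join style) filter. The function names, the `(table, column)` representation, and the sample tables are all hypothetical.

```python
from collections import defaultdict


def join_column_classes(equijoins):
    """Group join columns into classes: the transitive closure of '='.

    Each equijoin is a pair of (table, column) tuples. This grouping is an
    assumed stand-in for the paper's "pruning spaces".
    """
    parent = {}

    def find(col):
        parent.setdefault(col, col)
        while parent[col] != col:
            parent[col] = parent[parent[col]]  # path halving
            col = parent[col]
        return col

    for left, right in equijoins:
        parent[find(left)] = find(right)

    classes = defaultdict(set)
    for col in list(parent):
        classes[find(col)].add(col)
    return list(classes.values())


def prune_by_shared_values(tables, equijoins):
    """Single pass: keep rows whose join value occurs in every column of its class.

    Assumption: a simple value-intersection filter, NOT the paper's
    alpha-beta strategy or Hollow Range statistic.
    """
    pruned = {name: list(rows) for name, rows in tables.items()}
    for cls in join_column_classes(equijoins):
        shared = set.intersection(
            *({row[c] for row in pruned[t]} for t, c in cls)
        )
        for t, c in cls:
            pruned[t] = [row for row in pruned[t] if row[c] in shared]
    return pruned


# Hypothetical three-way equijoin: orders.cust_id = customer.id = payment.cust
tables = {
    "orders": [{"cust_id": 1}, {"cust_id": 2}, {"cust_id": 3}],
    "customer": [{"id": 2}, {"id": 3}, {"id": 4}],
    "payment": [{"cust": 3}, {"cust": 5}],
}
equijoins = [
    (("orders", "cust_id"), ("customer", "id")),
    (("customer", "id"), ("payment", "cust")),
]
pruned = prune_by_shared_values(tables, equijoins)
```

Here the two predicates place all three columns in one class, so only rows with the value 3, the single value present in all three columns, survive in each table before any join is executed.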
Keywords
Database System, Join Query, Data Pruning, Hollow Range, Query Optimization