Optimizing Union All Join Queries In Teradata

2017 IEEE 33RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2017)(2017)

引用 3|浏览15
暂无评分
摘要
The UNION ALL set operator is useful for combining data from multiple sources. With the emergence of big data ecosystems in which data is typically stored on multiple systems, UNION ALL has become even more important. In this paper, we present optimization techniques implemented in Teradata Database for join queries with UNION ALL. Instead of spooling all branches of UNION ALL before performing join operations, we propose cost-based pushing of joins into branches. Join pushing not only addresses the prohibitive cost of spooling all branches, but also helps in exposing more efficient join methods (e.g., direct hash-based joins) which, otherwise, would not be considered by the query optimizer. The geography of relations being pushed to UNION ALL branches is also adjusted to avoid unnecessary redistributions and duplications of data. We conclude the paper with a performance study that demonstrates the impact of the proposed optimization techniques on query performance.
更多
查看译文
关键词
UNION ALL set operator,Big Data ecosystems,optimization techniques,UNION ALL join query optimization,teradata database,data duplications,query performance
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要