JG2Time: A Learned Time Estimator for Join Operators Based on Heterogeneous Join-Graphs

Hao Miao, Jiazun Chen, Yang Lin, Mo Xu, Yinjun Han, Jun Gao

Database Systems for Advanced Applications (2023)

The join operator is one of the key operators in an RDBMS, and estimating its evaluation time is a fundamental task in query optimization, scheduling, and similar tasks. However, a precise estimate is hard to obtain, because the running time depends not only on the physical join implementation (hash, sort, loop) but also on its parameters, such as the size of the data, the number of partitions, and the number of threads in a modern hash join. Existing works either rely on time complexity analysis, which yields rough results, or employ machine learning techniques to build a predictive model, which requires many training instances. In this paper, we propose a method, named JG2Time, to estimate the running time using join-graphs constructed from the source code. Specifically, we construct a heterogeneous join-graph by annotating parameter nodes onto a call-graph generated by running-time analysis tools, and propose ReGAT, a heterogeneous graph neural network, to fully capture the edge weights (the number of function calls) in the join-graph. The embeddings learned by ReGAT can be used to predict the running time. In addition, we optimize JG2Time with a multi-task model that also predicts the times of function calls, and with an unsupervised code learning method to enhance its generalization. The experimental results illustrate the effectiveness of JG2Time and its optimization strategies.
running time estimation of join operator, call-graph, heterogeneous join-graph, heterogeneous graph neural network
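To make the join-graph idea from the abstract concrete, the following is a minimal sketch of one way such a structure could be represented: function nodes linked by call edges weighted with call counts, plus parameter nodes attached to the functions they influence. It is not the authors' implementation. The class `JoinGraph`, the function names (`hash_join`, `partition`, `hash_tuple`, `probe`), the parameter names, and all counts are hypothetical and chosen only for illustration. ReGAT itself is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class JoinGraph:
    """Toy heterogeneous graph: function nodes, parameter nodes, two edge types."""
    functions: set = field(default_factory=set)
    parameters: dict = field(default_factory=dict)  # parameter name -> value
    call_edges: dict = field(default_factory=dict)  # (caller, callee) -> call count
    param_edges: set = field(default_factory=set)   # (parameter name, function)

    def add_call(self, caller, callee, count):
        """Add a call edge; repeated edges accumulate their call counts."""
        self.functions.update((caller, callee))
        key = (caller, callee)
        self.call_edges[key] = self.call_edges.get(key, 0) + count

    def annotate(self, name, value, function):
        """Attach a parameter node to an existing function node."""
        if function not in self.functions:
            raise KeyError(f"unknown function node: {function}")
        self.parameters[name] = value
        self.param_edges.add((name, function))


def build_example():
    """Build a hypothetical join-graph for a hash join (all values invented)."""
    g = JoinGraph()
    g.add_call("hash_join", "partition", 1)
    g.add_call("partition", "hash_tuple", 1000)
    g.add_call("hash_join", "probe", 4)
    g.annotate("num_tuples", 1000, "partition")
    g.annotate("num_threads", 4, "probe")
    return g
```

In this sketch the call counts play the role of the edge weights mentioned in the abstract, and the parameter nodes carry settings such as data size or thread count. A graph neural network would consume a structure like this to produce embeddings for running-time prediction.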