EndGraph: An Efficient Distributed Graph Preprocessing System

Tianfeng Liu, Dan Li

2022 IEEE 42nd International Conference on Distributed Computing Systems (ICDCS), 2022

Abstract
Graph processing consists of two main stages: preprocessing and algorithm execution. Most previous proposals for improving the performance of graph processing systems focus on the algorithm execution stage and simply ignore the preprocessing overhead. In this work, however, we argue that the cost of preprocessing cannot be ignored, since the preprocessing time is much longer than the algorithm execution time in state-of-the-art systems. We propose EndGraph, a distributed graph preprocessing system, to improve preprocessing performance. First, for graph partitioning, we find that existing systems either assign imbalanced preprocessing workloads or spend too much time on graph partitioning. Hence, EndGraph proposes a novel chunk-based partition algorithm that balances preprocessing workloads and achieves the theoretical lower bound of time complexity. Second, for graph construction (converting the data layout from an edge array to an adjacency list), existing systems use counting sort, which is inefficient in both computation and communication. EndGraph employs a novel two-level graph construction method that carefully decouples graph construction into intra-machine and inter-machine construction. Our extensive evaluation results show that, compared with five state-of-the-art systems (LFGraph, PowerLyra, PowerGraph, D-Galois, and Gemini), EndGraph improves preprocessing performance by up to 35.76× (from 4.72×). To show the generality of EndGraph, we integrate it with D-Galois and Gemini, where it improves end-to-end graph processing performance (including preprocessing and algorithm execution) by up to 7.44× (from 2.96×).
Keywords
Graph processing, Distributed, Preprocessing
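
For readers unfamiliar with the graph construction step mentioned in the abstract, the following is a minimal single-machine sketch of the baseline it refers to: converting an edge array into an adjacency list with a counting sort. It is not EndGraph's two-level method, which the abstract does not describe in detail. The function name `edges_to_csr` and the compressed sparse row (CSR) output layout are assumptions chosen for illustration.

```python
from typing import List, Tuple


def edges_to_csr(num_vertices: int,
                 edges: List[Tuple[int, int]]) -> Tuple[List[int], List[int]]:
    """Convert an edge array into a CSR-style adjacency list via counting sort.

    Hypothetical helper for illustration only; not taken from the paper.
    Returns (offsets, neighbors), where the out-neighbors of vertex v are
    neighbors[offsets[v]:offsets[v + 1]].
    """
    # Pass 1: count the out-degree of every source vertex.
    offsets = [0] * (num_vertices + 1)
    for src, _dst in edges:
        offsets[src + 1] += 1

    # Prefix sum turns per-vertex counts into start offsets.
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]

    # Pass 2: scatter each destination into its source vertex's slot.
    cursor = offsets[:-1]  # slicing copies, so offsets stays intact
    neighbors = [0] * len(edges)
    for src, dst in edges:
        neighbors[cursor[src]] = dst
        cursor[src] += 1

    return offsets, neighbors


# Small usage example on a 3-vertex graph.
example_edges = [(2, 0), (0, 1), (0, 2), (1, 2)]
example_offsets, example_neighbors = edges_to_csr(3, example_edges)
```

The sketch makes two full passes over the edge array and writes to scattered positions in the second pass. In a distributed setting each edge would also have to be shipped to the machine that owns its source vertex, which is the communication cost the abstract alludes to.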