Scaling Graph 500 SSSP to 140 Trillion Edges with over 40 Million Cores

Yuanwei Wang, Huanqi Cao, Zixuan Ma, Wanwang Yin, Wenguang Chen

SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (2022)

Abstract

The SSSP kernel was first introduced into the Graph 500 benchmark in 2017. However, no result has yet been reported from a full-scale run on a top-tier supercomputer. The primary reason is the poor work efficiency of existing algorithms at large scales. In this paper, we propose an SSSP implementation for the new-generation Sunway supercomputer, including an SSSP algorithm that achieves work efficiency and an adaptive dense/sparse-mode selection approach that achieves communication efficiency. Our implementation reaches 7638 GTEPS with 103158 processors (over 40 million cores), achieving 3.7× the performance and 512× the graph size of the current top entry on the Graph 500 SSSP list. Based on our experience of running extreme-scale SSSP, we uncover the root cause of its poor scalability: the weight distribution allows edges with weights close to zero, making the SSSP tree deeper on larger graphs. We further explore a scalability-friendly weight distribution by setting a non-zero lower bound on the edge weights.
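The link between near-zero weights and SSSP tree depth can be illustrated with a small sequential sketch. This is not the paper's distributed algorithm: it uses plain Dijkstra, and the helper names `random_graph` and `sssp_with_depth`, the uniform random edge generator, and the parameter `lo` (the weight lower bound) are all assumptions made for illustration only.

```python
import heapq
import random


def random_graph(n, m, lo, seed):
    """Hypothetical generator: m random undirected edges, weights drawn from [lo, 1]."""
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for _ in range(m):
        u, v = rng.randrange(n), rng.randrange(n)
        w = lo + (1.0 - lo) * rng.random()  # lo = 0 allows weights arbitrarily close to zero
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


def sssp_with_depth(adj, src):
    """Plain Dijkstra; depth[v] is the hop count of v in the resulting SSSP tree."""
    n = len(adj)
    dist = [float("inf")] * n
    depth = [-1] * n  # -1 marks unreachable vertices
    dist[src], depth[src] = 0.0, 0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                depth[v] = depth[u] + 1  # v hangs one level below u in the tree
                heapq.heappush(heap, (nd, v))
    return dist, depth


if __name__ == "__main__":
    for lo in (0.0, 0.25):
        adj = random_graph(2000, 16000, lo, seed=1)
        dist, depth = sssp_with_depth(adj, 0)
        print(f"lo={lo}: max tree depth = {max(depth)}")
```

Since every edge on a shortest path weighs at least `lo`, a vertex at distance `d` sits at most `d / lo` hops from the source, so a positive lower bound caps the tree depth. With `lo = 0` no such cap exists, and chains of very light edges can pull the tree deeper.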

Key words

Graphs, Benchmark testing, Scalability, Supercomputers, Shortest path problem
