Liberator: A Data Reuse Framework for Out-of-Memory Graph Computing on GPUs

IEEE Transactions on Parallel and Distributed Systems (2023)

Abstract
Graph analytics is widely used in applications such as recommender systems, scientific computing, and data mining, and GPUs have become the major accelerators for such workloads. However, graph sizes grow rapidly and often exceed GPU memory capacity, incurring severe performance degradation due to frequent data transfers between the main memory and GPUs. To alleviate this problem, we focus on better utilizing the data already resident in GPU memory by exploiting data reuse across iterations. In our studies, we analyze the memory access patterns of graph applications at different granularities in depth. We find that the memory footprint is accessed in a roughly sequential scan without hotspots, which implies an extremely long reuse distance. Based on this observation, we propose a novel framework, called Liberator, to exploit data reuse within GPU memory. In Liberator, GPU memory is reserved for data that is potentially accessed across iterations, avoiding excessive data transfer between the main memory and GPUs. For data not resident in GPU memory, a Merged and Aligned memory access scheme is employed to improve transmission efficiency. We further optimize the framework by processing data in GPU memory and data in the main memory in parallel. We have implemented a prototype of the Liberator framework and conducted a series of performance evaluation experiments. The experimental results show that Liberator significantly reduces data transfer overhead, achieving an average speedup of 2.7x over a state-of-the-art approach.
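The abstract describes two roles for graph data: a partition kept resident in GPU memory and reused across iterations, and a partition left in host memory that is fetched on demand with merged and aligned (zero-copy) accesses. The paper gives no code here; the following minimal CUDA sketch only illustrates that general idea, not Liberator's actual implementation. All names (process_edges, the partition sizes, the reduction) are hypothetical.

```cuda
// Minimal sketch: GPU-resident partition plus zero-copy host-resident partition.
#include <cuda_runtime.h>
#include <cstdio>

// Consecutive threads touch consecutive elements, so reads from the pinned,
// mapped host partition are merged into aligned, cache-line-sized transfers.
__global__ void process_edges(const int *gpu_edges, int num_gpu_edges,
                              const int *host_edges, int num_host_edges,
                              unsigned long long *sum) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = gridDim.x * blockDim.x;
    unsigned long long local = 0;
    // Partition cached in GPU memory: reused across iterations, no transfer.
    for (int i = tid; i < num_gpu_edges; i += stride)
        local += gpu_edges[i];
    // Partition in pinned host memory: fetched on demand via zero-copy.
    for (int i = tid; i < num_host_edges; i += stride)
        local += host_edges[i];
    atomicAdd(sum, local);
}

int main() {
    const int n_gpu = 1 << 20, n_host = 1 << 20;  // hypothetical partition sizes

    // "Hot" partition kept resident in GPU memory across iterations.
    int *d_edges;
    cudaMalloc(&d_edges, n_gpu * sizeof(int));
    cudaMemset(d_edges, 0, n_gpu * sizeof(int));

    // "Cold" partition in pinned, mapped host memory (zero-copy).
    int *h_edges, *h_edges_dev;
    cudaHostAlloc(&h_edges, n_host * sizeof(int), cudaHostAllocMapped);
    for (int i = 0; i < n_host; ++i) h_edges[i] = 1;
    cudaHostGetDevicePointer(&h_edges_dev, h_edges, 0);

    unsigned long long *d_sum;
    cudaMalloc(&d_sum, sizeof(unsigned long long));
    cudaMemset(d_sum, 0, sizeof(unsigned long long));

    process_edges<<<256, 256>>>(d_edges, n_gpu, h_edges_dev, n_host, d_sum);
    cudaDeviceSynchronize();

    unsigned long long sum;
    cudaMemcpy(&sum, d_sum, sizeof(sum), cudaMemcpyDeviceToHost);
    printf("sum = %llu\n", sum);

    cudaFree(d_sum); cudaFree(d_edges); cudaFreeHost(h_edges);
    return 0;
}
```

In this sketch the two loops run inside one kernel, which loosely mirrors the abstract's point about processing GPU-resident and host-resident data in parallel rather than staging everything through explicit copies; Liberator's actual partitioning and scheduling are described in the paper itself.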
Keywords
Data reuse, GPU memory oversubscription, graph computing, partition-based method, zero-copy