DUCATI: A Dual-Cache Training System for Graph Neural Networks on Giant Graphs with the GPU

Proc. ACM Manag. Data (2023)

Abstract
Recently, Graph Neural Networks (GNNs) have achieved great success in many applications. Mini-batch training has become the de facto way to train GNNs on giant graphs. However, mini-batch generation is extremely expensive and slows down the whole training process. Researchers have proposed several solutions to accelerate mini-batch generation, but these (1) fail to exploit the locality of the adjacency matrix, (2) cannot fully utilize GPU memory, and (3) adapt poorly to diverse workloads. In this work, we propose DUCATI, a Dual-Cache system that overcomes these drawbacks. In addition to the traditional Nfeat-Cache, DUCATI introduces a new Adj-Cache to further accelerate mini-batch generation and better utilize GPU memory. DUCATI develops a workload-aware Dual-Cache Allocator which adaptively finds the best cache allocation plan under different settings. We compare DUCATI with various GNN training systems on four billion-scale graphs under diverse workload settings. The experimental results show that, in terms of training time, DUCATI achieves up to 3.33x speedup (2.07x on average) over DGL and up to 1.54x speedup (1.32x on average) over state-of-the-art Single-Cache systems. We also analyze the time-accuracy trade-offs of DUCATI and four state-of-the-art GNN training systems; the analysis offers users guidelines on system selection for different input sizes and hardware resources.
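The core idea of the allocator, splitting one GPU-memory budget between an adjacency cache and a node-feature cache according to the observed workload, can be pictured as a greedy benefit-per-byte allocation. The sketch below is a hypothetical illustration only: the names (Candidate, plan_dual_cache), the profiled hotness inputs, and the greedy heuristic are assumptions made for this example, not DUCATI's actual implementation or API.

    # Hypothetical sketch of a workload-aware dual-cache allocator in the
    # spirit of DUCATI's Dual-Cache Allocator. All names and the greedy
    # benefit-per-byte heuristic are illustrative assumptions.
    from dataclasses import dataclass

    @dataclass
    class Candidate:
        kind: str        # "adj" (adjacency list) or "nfeat" (node features)
        node_id: int
        hotness: float   # expected accesses per epoch (from workload profiling)
        nbytes: int      # GPU memory cost of caching this entry

    def plan_dual_cache(candidates: list[Candidate], budget_bytes: int) -> dict:
        """Greedily fill a single GPU-memory budget with the entries (from
        either cache) that save the most host-to-GPU traffic per byte spent."""
        plan = {"adj": set(), "nfeat": set()}
        used = 0
        # Rank all entries from both caches by accesses saved per byte.
        for c in sorted(candidates, key=lambda c: c.hotness / c.nbytes, reverse=True):
            if used + c.nbytes <= budget_bytes:
                plan[c.kind].add(c.node_id)
                used += c.nbytes
        return plan

    # Toy usage: adjacency and feature entries competing for a 1 KiB budget.
    cands = [
        Candidate("adj",   0, hotness=500.0, nbytes=256),
        Candidate("adj",   1, hotness=10.0,  nbytes=256),
        Candidate("nfeat", 0, hotness=300.0, nbytes=512),
        Candidate("nfeat", 1, hotness=5.0,   nbytes=512),
    ]
    print(plan_dual_cache(cands, budget_bytes=1024))

Ranking adjacency and feature entries in a single ordering is what would let hot adjacency lists displace cold features (and vice versa) as the workload shifts, rather than fixing the split between the two caches in advance.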
Keywords
graph neural networks, giant graphs, dual-cache