GraNDe: Efficient Near-Data Processing Architecture for Graph Neural Networks

IEEE Transactions on Computers (2023)

Abstract
Graph Neural Network (GNN) models have attracted attention for their high accuracy in interpreting graph data. One of the primary building blocks of a GNN model is aggregation, which gathers and averages the feature vectors of the nodes adjacent to each node. Aggregation works by multiplying the adjacency and feature matrices. For many realistic datasets, both matrices exceed the on-chip cache capacity, and the adjacency matrix is highly sparse. These characteristics lead to little data reuse, causing intensive main-memory accesses during aggregation. Thus, aggregation is memory-intensive and dominates the total execution time. In this paper, we propose GraNDe, an NDP architecture that accelerates memory-intensive aggregation by locating NDP modules near the DRAM datapath to exploit rank-level parallelism. GraNDe maximizes bandwidth utilization by separating the memory channel path with a buffer chip in between, so that pre-/post-processing in the host processor and reduction in the NDP modules operate simultaneously. By exploring the preferred data mappings of the operand matrices to DRAM ranks, we architect GraNDe to support adaptive matrix mapping, which applies the optimal mapping for each layer depending on the layer's dimensions and the memory system configuration. We also propose adj-bundle broadcasting and re-tiling optimizations to reduce the transfer time for adjacency matrix data and to improve feature vector data reusability by tiling with consideration of adjacency between nodes. GraNDe achieves average speedups of 3.01× and 1.69× (up to 4.00× and 1.98×) on GCN aggregation over the baseline system and the state-of-the-art NDP architecture for GCN, respectively.
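The aggregation step described in the abstract can be sketched as a matrix product followed by a per-node average. Below is a minimal NumPy illustration using a dense toy adjacency matrix; this is an assumption-laden sketch for clarity only — real GNN workloads of the kind the paper targets store the adjacency matrix in a sparse format, since it is highly sparse and exceeds on-chip cache capacity.

```python
import numpy as np

def aggregate(adj, feats):
    """GNN aggregation: average the feature vectors of each node's neighbors.

    adj   : (N, N) binary adjacency matrix (dense here for illustration;
            production systems use sparse storage and SpMM kernels)
    feats : (N, F) node feature matrix
    """
    deg = adj.sum(axis=1, keepdims=True)   # number of neighbors per node
    deg[deg == 0] = 1                      # avoid division by zero for isolated nodes
    return (adj @ feats) / deg             # gather (matrix multiply), then average

# Toy graph: edges 0-1 and 1-2
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
feats = np.array([[1.0, 0.0],
                  [0.0, 2.0],
                  [4.0, 4.0]])
print(aggregate(adj, feats))
```

Node 1, for example, averages the features of nodes 0 and 2, yielding [2.5, 2.0]. The multiply-then-average structure is what makes the operation bandwidth-bound: each sparse row of `adj` pulls a different set of feature vectors from memory with little reuse.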
Keywords
Near-data processing, DRAM, graph neural networks