A Hierarchical Jacobi Iteration for Structured Matrices on GPUs using Shared Memory

2022 IEEE High Performance Extreme Computing Conference (HPEC), 2022

Abstract
This paper presents an algorithm to accelerate the Jacobi iteration for solving linear systems of equations arising from structured problems on graphics processing units (GPUs). Acceleration is achieved by using on-chip GPU shared memory via a domain decomposition procedure. In particular, the problem domain is partitioned into subdomains whose data are copied to the shared memory of each GPU block. Jacobi iterations are then performed within each block's shared memory, avoiding expensive global memory accesses at every iteration; the result is a hierarchical algorithm that takes advantage of the GPU memory hierarchy. We investigate the algorithm's performance on the linear systems arising from the discretization of Poisson's equation in 1D and 2D, and observe an 8x speedup in convergence for the 1D problem and a nearly 6x speedup in 2D compared to a conventional GPU implementation of the Jacobi iteration that relies only on global memory.
Keywords
hierarchical Jacobi iteration, structured matrices, GPUs, shared memory, linear systems, structured problems, on-chip GPU, domain decomposition procedure, GPU block, expensive global memory accesses every iteration, hierarchical algorithm, GPU memory hierarchy, algorithm performance
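
Illustrative sketch

The subdomain scheme described in the abstract can be illustrated with a small CPU emulation for the 1D Poisson problem -u'' = f with zero boundary values. This is only a sketch under stated assumptions, not the paper's implementation: it is written in plain Python rather than GPU code, it assumes non-overlapping subdomains with a one-point halo and a fixed number of inner sweeps, and all function names and parameters (`hierarchical_cycle`, `block_size`, `inner_iters`, and so on) are hypothetical. The local list `local_u` stands in for the shared-memory copy held by one GPU block, and the paper's actual decomposition may differ.

```python
def jacobi_sweep(u, f, h2):
    """One Jacobi sweep for -u'' = f; u[0] and u[-1] are held fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1] + h2 * f[i])
    return new


def hierarchical_cycle(u, f, h2, block_size, inner_iters):
    """One cycle of the emulated scheme (assumed form, see lead-in).

    Every block copies its subdomain plus a one-point halo from the
    old global state, runs inner_iters local Jacobi sweeps with the
    halo frozen, and writes only its interior points back.
    """
    n = len(u)
    result = u[:]
    for start in range(1, n - 1, block_size):
        stop = min(start + block_size, n - 1)
        local_u = u[start - 1:stop + 1]   # stand-in for shared memory
        local_f = f[start - 1:stop + 1]
        for _ in range(inner_iters):
            local_u = jacobi_sweep(local_u, local_f, h2)
        result[start:stop] = local_u[1:-1]
    return result


def max_error(u, exact):
    """Max-norm distance between an iterate and the exact solution."""
    return max(abs(a - b) for a, b in zip(u, exact))


def run_demo(n=66, block_size=16, inner_iters=8, cycles=10):
    """Compare plain Jacobi and the emulated scheme for f = 1.

    Both methods perform the same number of global updates (cycles).
    For f = 1 the discrete solution is x * (1 - x) / 2 at each grid point.
    """
    h = 1.0 / (n - 1)
    h2 = h * h
    f = [1.0] * n
    exact = [0.5 * (i * h) * (1.0 - i * h) for i in range(n)]
    plain = [0.0] * n
    hier = [0.0] * n
    for _ in range(cycles):
        plain = jacobi_sweep(plain, f, h2)
        hier = hierarchical_cycle(hier, f, h2, block_size, inner_iters)
    return max_error(plain, exact), max_error(hier, exact)


if __name__ == "__main__":
    err_plain, err_hier = run_demo()
    print("plain Jacobi error:", err_plain)
    print("hierarchical error:", err_hier)
```

In this emulation both methods exchange data through the global array the same number of times, but the block-local sweeps do extra work between exchanges, so the hierarchical iterate ends up closer to the solution. On a GPU those extra sweeps would run against shared memory, which is the source of the savings the abstract describes; the emulation says nothing about actual timings.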