FP-AMG: FPGA-Based Acceleration Framework for Algebraic Multigrid Solvers

2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2020

Abstract
Partial differential equations (PDEs) are fundamental to many real-world scientific computing applications, so optimizing their solution has undergone decades of study. Algebraic multigrid (AMG) is one of the most well-known solvers and is widely adopted in high-performance computing (HPC) because of its good scalability. Accelerating AMG is known to be very challenging for three reasons: (1) irregular computation patterns, (2) random memory access, and (3) a large number of kernels with various computation types. To the best of our knowledge, there is no prior work on FPGA-based acceleration of AMG. To tackle these challenges, we propose an efficient FPGA-based reconfigurable framework, called FP-AMG, for high-performance AMG calculation. To obtain full pipeline utilization, we propose a novel and scalable architecture that can be reused for all kernels in AMG. Given that AMG is strictly memory-bound, we propose algorithmic and architectural optimizations to ensure nearly ideal use of memory bandwidth. The efficiency of FP-AMG is evaluated with six well-known benchmarks on two FPGA devices: one with and one without high bandwidth memory (HBM). The experimental results are compared with a highly optimized Intel Xeon E5-2680-V4 implementation of the state-of-the-art HYPRE library. Our experiments show that FP-AMG achieves average speedups of $2.5\times$ and $6.6\times$ for FPGAs without and with HBM, respectively.
Keywords
memory bandwidth, architectural optimizations, memory-bound, pipeline utilization, FPGA-based reconfigurable framework, FP-AMG, Intel Xeon E5-2680-V4 implementation, scientific computing applications, partial differential equations, algebraic multigrid solvers, FPGA-based acceleration framework, high bandwidth memory, high-performance AMG calculation, random memory access, high performance computing
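
The abstract's points about random memory access and memory-bound behavior can be illustrated with a sparse matrix-vector product (SpMV), a kernel that commonly appears in AMG cycles (smoothing, residual computation, grid transfers). The sketch below is not taken from the paper and does not describe the FP-AMG architecture. The CSR layout, the function names `csr_spmv` and `flops_per_byte`, and the byte counts (8-byte values, 4-byte indices, no cache reuse of `x`) are assumptions chosen only for illustration.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR format."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is read through col_idx[k]: an indirect, data-dependent access
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def flops_per_byte(nnz, n_rows, value_bytes=8, index_bytes=4):
    """Rough arithmetic intensity of CSR SpMV (assumes no reuse of x in cache)."""
    flops = 2 * nnz  # one multiply and one add per nonzero
    bytes_moved = (
        nnz * (value_bytes + index_bytes)  # values and col_idx streams
        + nnz * value_bytes                # gathered entries of x
        + (n_rows + 1) * index_bytes       # row_ptr
        + n_rows * value_bytes             # y, written once
    )
    return flops / bytes_moved


# 1D Poisson matrix (tridiagonal stencil -1, 2, -1) with 4 unknowns
row_ptr = [0, 2, 5, 8, 10]
col_idx = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
x = [1.0, 2.0, 3.0, 4.0]

y = csr_spmv(row_ptr, col_idx, values, x)
intensity = flops_per_byte(nnz=len(values), n_rows=4)
```

Under these assumptions the kernel performs well under one floating-point operation per byte moved, which is why throughput tends to be limited by memory bandwidth rather than by arithmetic units. For unstructured matrices the gather through `col_idx` touches `x` in an order fixed by the sparsity pattern, which is the kind of irregular access the abstract refers to.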