Maximizing Performance Through Memory Hierarchy-Driven Data Layout Transformations

2022 IEEE/ACM Workshop on Memory Centric High Performance Computing (MCHPC), 2022

Abstract
Computations on structured grids using standard multidimensional array layouts can incur substantial data movement costs through the memory hierarchy. This paper explores the benefits of using a framework (Bricks) to separate the complexity of data layout and optimized communication from the functional representation. To that end, we provide three novel contributions and evaluate them on several kernels taken from GENE, a phase-space fusion tokamak simulation code. We extend Bricks to support 6-dimensional arrays and kernels that operate on complex data types, and integrate Bricks with cuFFT. We demonstrate how to optimize Bricks for data reuse, spatial locality, and GPU hardware utilization, achieving up to a 2.67× speedup on a single A100 GPU. We conclude with insights on how to rearchitect memory subsystems.
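To make the idea of a layout transformation concrete, the following is a minimal sketch of a blocked ("bricked") layout, in which each small tile of the grid is stored contiguously instead of being spread across rows. It is an illustration only and does not use the Bricks API: the function names `brick_offset` and `to_bricked`, the 2D grid, and the brick size of 4 are all assumptions chosen for brevity, whereas the paper works with up to 6-dimensional arrays.

```python
def brick_offset(i, j, n_cols, brick=4):
    """Map element (i, j) of a 2D grid to its offset in a bricked layout.

    Hypothetical helper for illustration; assumes n_cols is a multiple of brick.
    """
    bricks_per_row = n_cols // brick
    bi, bj = i // brick, j // brick      # which brick holds the element
    li, lj = i % brick, j % brick        # position inside that brick
    brick_id = bi * bricks_per_row + bj  # bricks themselves are stored row-major
    return brick_id * brick * brick + li * brick + lj


def to_bricked(grid, brick=4):
    """Copy a row-major 2D grid (list of lists) into a flat bricked array."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = [None] * (n_rows * n_cols)
    for i in range(n_rows):
        for j in range(n_cols):
            out[brick_offset(i, j, n_cols, brick)] = grid[i][j]
    return out


# Example: an 8x8 grid whose values equal their row-major index.
grid = [[i * 8 + j for j in range(8)] for i in range(8)]
bricked = to_bricked(grid, brick=4)
```

In this sketch, the 16 elements of a 4x4 tile occupy one contiguous run of memory, so a stencil that touches neighbors within a tile reads from a single compact region rather than from four widely separated rows.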
Keywords
Stencil,Memory Layout Optimization,High Dimensional Computations