Performance Evaluation of Lattice Boltzmann Method for Fluid Simulation on A64FX Processor and Supercomputer Fugaku.

HPC Asia (2022)

Abstract
The lattice Boltzmann method (LBM) has recently become popular as an alternative to Navier-Stokes solvers for large-scale fluid simulations. We conduct a performance study of the LBM on the Arm-based A64FX processor of the supercomputer Fugaku. We compare four data layouts, SoA, AoS, Clustered SoA (CSoA), and CSoA2, and three algorithms for the LBM streaming step: the Pull, Push, and Swap schemes. Performance measurements on a single CMG (Core Memory Group) show that the combination of the CSoA2 layout and the Swap scheme achieves the highest performance, 176 GFLOPS, which corresponds to 11.5% of the single-precision peak performance. Our simulations demonstrate good weak scaling up to 16,384 nodes and achieve 10.9 PFLOPS in single precision. Strong scaling is also good, with parallel efficiencies of 63.9%, 68.3%, and 72.7% for the D3Q15, D3Q19, and D3Q27 velocity models, respectively, when scaling from 512 to 16,384 nodes.
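To make the data layouts named in the abstract concrete, the following is a minimal sketch of how a distribution value f(cell, q) could be mapped to a flat array offset under AoS, SoA, and a clustered SoA layout. The function names, the argument order, and the cluster size of 16 (the number of single-precision lanes in a 512-bit SIMD register) are assumptions made for illustration, not details taken from the paper. CSoA2 is the paper's own variant and is not sketched here because the abstract does not describe it.

```python
def aos_index(cell, q, n_cells, n_q):
    """Array of Structures: all n_q directions of one cell are contiguous."""
    return cell * n_q + q


def soa_index(cell, q, n_cells, n_q):
    """Structure of Arrays: all cells of one direction are contiguous."""
    return q * n_cells + cell


def csoa_index(cell, q, n_cells, n_q, cluster=16):
    """Clustered SoA (illustrative): cells are grouped into blocks of
    `cluster` cells, and each block is stored in SoA order.
    Assumes n_cells is a multiple of `cluster`."""
    block, lane = divmod(cell, cluster)
    return block * (n_q * cluster) + q * cluster + lane


if __name__ == "__main__":
    n_cells, n_q = 32, 19  # e.g. a D3Q19 model on 32 cells
    for name, fn in [("AoS", aos_index), ("SoA", soa_index), ("CSoA", csoa_index)]:
        # Offsets of direction q=2 for the first four cells
        print(name, [fn(c, 2, n_cells, n_q) for c in range(4)])
```

In this sketch, AoS places the same direction of neighbouring cells n_q elements apart, whereas SoA and the clustered layout place them at unit stride, which is the access pattern that SIMD loads favour. The clustered layout additionally keeps all directions of one block of cells close together in memory.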