
A study of the effects of machine geometry and mapping on distributed transpose performance.

CF '08: Computing Frontiers Conference, Ischia, Italy, May 2008

Abstract
This paper describes a parallel strategy to extend the scalability of a small 3D FFT on thousands of Blue Gene/L processors. The approach is to execute the intermediate phases of the 3D FFT on smaller processor subsets. Performance measurements of the standalone 3D FFT with two communication protocols, MPI and BG/L ADE, are presented. While the MPI-based and BG/L ADE-based implementations of the 3D FFT exhibit qualitatively similar behavior, the BG/L ADE-based version has a lower communication cost than the MPI-based version for small message sizes. Measurements also show that the proposed approach significantly improves the performance of Particle-Mesh-based N-body simulations at the limits of scalability.
Keywords
transpose performance, machine geometry, mapping