A systematic process for efficient execution on Intel's heterogeneous computation nodes

XSEDE '12: Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond(2012)

Abstract
Heterogeneous architectures (mainstream CPUs with accelerators/co-processors) are expected to become more prevalent in high performance computing clusters. This paper deals specifically with attaining efficient execution on nodes which combine Intel's multicore Sandy Bridge chips with MIC manycore chips. The architecture and software stack for Intel's heterogeneous computation nodes attempt to make migration from the now common multicore chips to the manycore chips straightforward. However, these manycore chips favor specific execution characteristics, such as use of the wider vector instructions and minimal inter-thread conflicts. Additionally, manycore chips have a lower clock speed and no unified last-level cache. As a result, and as we demonstrate in this paper, it will commonly be the case that not all parts of an application execute more efficiently on the manycore chip than on the multicore chip. This paper presents a process, based on measurements of execution on Westmere-based multicore chips, that can accurately predict which code segments will execute efficiently on the manycore chips, and it illustrates and evaluates the application of that process to three substantial full programs: HOMME, MOIL and MILC. The effectiveness of the process is validated by verifying the scalability of the specific functions and loops that were recommended for MIC execution on a Knights Ferry computation node.
Keywords
heterogeneous computation node,multicore sandy bridge chip,multicore chip,manycore chip,paper deal,common multicore chip,systematic process,efficient execution,mic execution,specific execution characteristic,westmere-based multicore chip,mic manycore chip,heterogeneous computing,chip