Exploiting locality in the run-time parallelization of irregular loops

ICPP(2002)

Abstract
The goal of this work is the efficient parallel execution of loops with indirect array accesses, so that it can be embedded in a parallelizing compiler framework. In this kind of loop pattern, dependences cannot always be determined at compile time because, in many cases, they involve input data that are only known at run time and/or the access pattern is too complex to be analyzed. In this paper we propose run-time strategies for the parallelization of these loops. Our approaches focus not only on extracting parallelism among iterations of the loop, but also on exploiting data access locality to improve memory hierarchy behavior and, thus, the overall program speedup. Two strategies are proposed: one based on graph partitioning techniques and the other based on a block-cyclic distribution. Experimental results show that the two strategies are complementary and that the choice of the best alternative depends on some features of the loop pattern.
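To make the loop pattern concrete, the following is a minimal sketch of a loop with an indirect array access and of a block-cyclic assignment of iterations to threads. It is an illustration under stated assumptions, not the method of the paper: the function names, the single write per iteration, the block size, and the thread count are all hypothetical, and the per-thread lists are executed sequentially here.

```python
def block_cyclic_owner(index, block_size, num_threads):
    """Thread that owns array element `index` under a block-cyclic layout."""
    return (index // block_size) % num_threads


def partition_iterations(idx, block_size, num_threads):
    """Assign iteration i to the thread that owns the element idx[i] it writes.

    Assumption for this sketch: each iteration writes exactly one element.
    """
    work = [[] for _ in range(num_threads)]
    for i, target in enumerate(idx):
        work[block_cyclic_owner(target, block_size, num_threads)].append(i)
    return work


def irregular_loop(a, idx, values, iterations):
    """Loop body a[idx[i]] += values[i], restricted to the given iterations."""
    for i in iterations:
        a[idx[i]] += values[i]


# Indirection array: its contents are only known at run time.
idx = [5, 0, 3, 5, 2, 7, 0, 6]
values = [1.0] * len(idx)
a = [0.0] * 8

# Run-time step: inspect idx and build one iteration list per thread.
work = partition_iterations(idx, block_size=2, num_threads=2)

# Each list touches a disjoint set of elements of `a`, so the lists could
# run on separate threads without conflicting writes.
for iterations in work:
    irregular_loop(a, idx, values, iterations)
```

Because every iteration that writes a given element lands in the same list, iterations 0 and 3 (both writing `a[5]`) never run on different threads, and each thread works on contiguous blocks of `a`, which is the kind of locality the abstract refers to.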
Keywords
parallelising compilers, program control structures, access pattern, block-cyclic distribution, efficient parallel execution, graph partitioning techniques, indirect array accesses, loop pattern, memory hierarchy behavior, parallelizing compiler framework, program speedup