Extending OpenMP to Facilitate Loop Optimization.
Lecture Notes in Computer Science (2018)
Abstract
OpenMP provides several mechanisms to specify parallel source-code transformations. Unfortunately, many compilers perform these transformations early in the translation process, often before performing traditional sequential optimizations, which can limit the effectiveness of those optimizations. Further, in some cases OpenMP semantics preclude performing those sequential optimizations before the parallel transformations, which can limit overall application performance. In this paper, we propose extensions to OpenMP that require the application of traditional sequential loop optimizations. These extensions can be specified to apply before, as well as after, other OpenMP loop transformations. We discuss limitations implied by existing OpenMP constructs as well as some previously proposed (parallel) extensions to OpenMP that could benefit from constructs that explicitly apply sequential loop optimizations. We present results that explore how these capabilities can lead to as much as a 20% improvement in parallel loop performance by applying common sequential loop optimizations.
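To make the ordering issue concrete, here is a minimal sketch of a sequential loop optimization (unrolling by 4) applied by hand before the OpenMP worksharing transformation. It uses only standard OpenMP; it does not show the extensions proposed in the paper, whose syntax is not given in this abstract. The function names `scale_add` and `scale_add_unrolled` are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Baseline: the worksharing directive is applied to the original loop.
 * Whether the loop body is later unrolled is left to the compiler. */
void scale_add(double *y, const double *x, double a, size_t n) {
    #pragma omp parallel for
    for (long i = 0; i < (long)n; ++i) {
        y[i] += a * x[i];
    }
}

/* Hand-unrolled variant: the sequential optimization is written out
 * explicitly, and the worksharing directive is applied to the
 * already-unrolled loop. Assumption: an unroll factor of 4 is used
 * purely as an example. */
void scale_add_unrolled(double *y, const double *x, double a, size_t n) {
    long main_end = (long)(n - n % 4); /* largest multiple of 4 <= n */

    #pragma omp parallel for
    for (long i = 0; i < main_end; i += 4) {
        y[i]     += a * x[i];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }

    /* Remainder iterations (at most 3) run sequentially. */
    for (long i = main_end; i < (long)n; ++i) {
        y[i] += a * x[i];
    }
}
```

Both functions compute the same result. The second shows the manual rewriting a programmer might otherwise need when a compiler lowers the OpenMP directive before running its sequential loop optimizations. Compiled without OpenMP support, the pragmas are ignored and both loops run sequentially.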
Keywords
Loop optimization, Loop chain abstraction, Heterogeneous adaptive worksharing, Memory transfer pipelining