Using GPU Shared Memory with a Directive-Based Approach

IPDPS Workshops (2014)

Abstract
Graphics Processing Units (GPUs) have been increasingly adopted by the High-Performance Computing community. Their hardware architecture supports hundreds or thousands of lightweight threads in a more power-efficient manner than traditional CPUs, and with higher overall performance. This motivates porting highly parallel applications to GPUs. Programming GPUs is not a trivial task, particularly for programmers familiar with x86-like architectures. CUDA and OpenCL are two low-level programming APIs designed to ease GPU programming. Unfortunately, the resulting GPU code departs greatly from traditional code in both syntax and structure, which makes it hard to maintain. To preserve the original code structure, directive-based programming models have been developed (OpenACC, HMPP, etc.). In such models, the code is augmented with directives (as when using OpenMP) that guide the compiler to generate CUDA/OpenCL code automatically. To optimize performance, code restructuring is still needed to make full and specific use of GPU hardware features, e.g. GPU shared memory. In this paper, we explore various directive-based approaches to porting a well-known Oil and Gas industry algorithm (Reverse Time Migration, or RTM) to GPUs while trying to balance code portability against performance. Our HMPP implementation achieves 85% of the performance of a highly optimized CUDA version, as measured at the time of this work in the summer of 2013.
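As a rough illustration of what "augmenting code with directives" can look like, the following is a minimal sketch in C with OpenACC pragmas. It is not taken from the paper: the function name `stencil_1d`, the 3-point averaging stencil, and the chosen data clauses are all assumptions made for illustration, and the paper's own results are for HMPP rather than OpenACC. The `cache` directive is OpenACC's hint that a small window of an array may be staged in fast on-chip memory (typically GPU shared memory); whether a given compiler honors it is implementation-dependent. A compiler without OpenACC support ignores the pragmas and runs the loop serially, which is the portability argument the abstract makes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 3-point averaging stencil (illustrative only, not the RTM kernel). */
void stencil_1d(const float *restrict in, float *restrict out, int n)
{
    assert(in != NULL && out != NULL);
    if (n <= 0) {
        return;
    }

    /* Boundary points have no full neighborhood; copy them unchanged on the host. */
    out[0] = in[0];
    out[n - 1] = in[n - 1];

    /* copy(out) rather than copyout(out), so the boundary values written
       above are preserved when the array is transferred back from the device. */
    #pragma acc parallel loop copyin(in[0:n]) copy(out[0:n])
    for (int i = 1; i < n - 1; ++i) {
        /* Hint: the 3-element window starting at in[i-1] may be kept in
           on-chip memory. The compiler is free to ignore this hint. */
        #pragma acc cache(in[i-1:3])
        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0f;
    }
}
```

The loop body is unchanged from what a plain CPU version would contain; only the two pragma lines are GPU-specific. A hand-written CUDA equivalent would instead need an explicit kernel, a `__shared__` tile, and manual synchronization, which is the structural departure the abstract describes.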
Keywords
graphics processing units, hybrid multicore parallel programming, gpu, shared memory, cuda, parallel programming, reverse time migration, parallel architectures, hmpp implementation, shared memory systems, code portability, oil-and-gas industry algorithm, rtm, directive-based programming models, directive-based, gpu shared memory, programming, kernel, computer architecture, imaging, instruction sets