OCAL: An Abstraction for Host-Code Programming with OpenCL and CUDA

2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS), 2018

Abstract
The state-of-the-art parallel programming approaches OpenCL and CUDA require so-called host code for a program's execution. Implementing host code is often a cumbersome task, especially when executing OpenCL and CUDA programs on systems with multiple devices, e.g., multi-core CPUs and Graphics Processing Units (GPUs): the programmer is responsible for explicitly managing the system's main memory and the devices' memories, synchronizing computations with data transfers between main and/or devices' memories, and optimizing data transfers, e.g., by using pinned main memory to accelerate transfers and by overlapping transfers with computations. In this paper, we present OCAL (OpenCL/CUDA Abstraction Layer), a high-level approach to simplify the development of host code. OCAL combines five major advantages over the state-of-the-art high-level approaches: 1) it simplifies implementing both OpenCL and CUDA host code by providing a simple-to-use, uniform high-level host code abstraction API; 2) it supports executing arbitrary OpenCL and CUDA programs; 3) it simplifies implementing data-transfer optimizations by providing specially-optimized memory buffers, e.g., for conveniently using pinned main memory; 4) it optimizes memory management by automatically avoiding unnecessary data transfers; 5) it enables interoperability between OpenCL and CUDA host code by automatically managing the communication between OpenCL and CUDA data structures and by automatically translating between the OpenCL and CUDA programming constructs. Our experiments demonstrate that OCAL significantly simplifies implementing host code with a low runtime overhead for abstraction.
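The host-code burden the abstract refers to can be illustrated with plain CUDA host code: explicit device-memory allocation, pinned main memory, and asynchronous transfer/compute coordination all have to be written by hand. The sketch below uses only standard CUDA runtime API calls (cudaMallocHost, cudaMalloc, cudaMemcpyAsync, streams); it does not show OCAL's own API, which is not given in this abstract — the kernel and problem size are illustrative assumptions.

```cpp
// Plain CUDA host code for a simple vector scale, showing the explicit
// memory management and transfer handling that OCAL-style abstractions
// aim to hide. Standard CUDA runtime API only; sizes are illustrative.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Pinned (page-locked) main memory accelerates transfers and is
    // required for truly asynchronous cudaMemcpyAsync.
    float *h_data;
    cudaMallocHost(&h_data, bytes);
    for (int i = 0; i < n; ++i) h_data[i] = 1.0f;

    // Explicit device allocation: the programmer manages device memory.
    float *d_data;
    cudaMalloc(&d_data, bytes);

    // A stream allows overlapping transfers and computations with other work.
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    cudaMemcpyAsync(d_data, h_data, bytes, cudaMemcpyHostToDevice, stream);
    scale<<<(n + 255) / 256, 256, 0, stream>>>(d_data, 2.0f, n);
    cudaMemcpyAsync(h_data, d_data, bytes, cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    printf("h_data[0] = %f\n", h_data[0]);

    cudaStreamDestroy(stream);
    cudaFree(d_data);
    cudaFreeHost(h_data);
    return 0;
}
```

For multi-device OpenCL/CUDA programs, this boilerplate grows with every buffer and device, which is the development effort the paper's uniform abstraction API targets.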
Keywords
Graphics processing units, Kernel, Data transfer, Memory management, Programming, Performance evaluation, Optimization