NAS Parallel Benchmarks with Python: a performance and programming effort analysis focusing on GPUs

The Journal of Supercomputing (2022)

Abstract
Compiled low-level languages, such as C/C++ and Fortran, have been the usual programming tools for implementing applications that exploit GPU devices. As a counterpoint to that trend, this paper presents a performance and programming effort analysis with Python, an interpreted high-level language, which was applied to develop the kernels and applications of the NAS Parallel Benchmarks (NPB) targeting GPUs. We used the Numba environment to enable CUDA support in Python; this tool allows us to implement the GPU programs in pure Python code. Our experimental results showed that the Python applications reached performance similar to C++ programs employing CUDA, and better than C++ using OpenACC, for most NPB benchmarks. Furthermore, the Python codes demanded fewer operations related to the GPU framework than CUDA, mainly because Python needs fewer statements to manage memory allocations and data transfers. Despite that, our Python implementations required more operations than the OpenACC ones.
Keywords
NPB, GPU, Python, Numba, Programming effort