O2ath: an OpenMP offloading toolkit for the Sunway heterogeneous manycore platform

CCF Transactions on High Performance Computing (2024)

Abstract
The next-generation Sunway supercomputer employs the SW26010pro processor, which features a specialized on-chip heterogeneous architecture. Applications with significant hotspots can benefit from the large computational capacity of the Sunway many-core architecture, but only after intensive manual many-core parallelization. Some projects with large codebases, however, have no significant hotspots, and the cost of manually porting such applications to the Sunway architecture is almost unaffordable. To overcome this challenge, we have developed a toolkit named O2ATH. O2ATH forwards GNU OpenMP runtime library calls to Sunway’s Athread library, which greatly simplifies parallelization work on the Sunway architecture. O2ATH lets users write both MPE and CPE code in a single file and parallelize it with OpenMP directives and attributes. With O2ATH, users can efficiently port projects with large codebases to CPEs and achieve competitive speedups. In practice, O2ATH has helped us port two large projects, CESM and ROMS. In the experiments, kernels using O2ATH achieved speedups of 3 to 15 times over the same kernels running only on MPEs, resulting in whole-application speedups of 3 to 6 times. Furthermore, O2ATH requires significantly fewer code modifications than manually crafting CPE functions. This indicates that O2ATH can greatly enhance development efficiency when porting or optimizing large software projects on Sunway supercomputers.
Keywords
Heterogeneous architecture, Non-intrusive proxy toolkit, OpenMP offloading, Optimizations