Learning from Automatically Versus Manually Parallelized NAS Benchmarks.

LCPC（2022）

引用 0|浏览6

暂无评分

摘要

By comparing automatically versus manually parallelized NAS Benchmarks, we identify code sections that differ, and we discuss opportunities for advancing auto-parallelizers. We find ten patterns that challenge current parallelization technology. We also measure the potential impact of advanced techniques that could perform the needed transformations automatically. While some of our findings are not surprising and difficult to attain – compilers need to get better at identifying parallelism in outermost loops and in loops containing function calls – other opportunities are within reach and can make a difference. They include combining loops into parallel regions, avoiding load imbalance, and improving reduction parallelization. Advancing compilers through the study of hand-optimized code is a necessary path to move the forefront of compiler research. Very few recent papers have pursued this goal, however. The present work tries to fill this void.

查看译文

关键词

learning

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要