Auto Adaptive Irregular OpenMP Loops

arxiv(2021)

引用 0|浏览0
暂无评分
摘要
OpenMP is a standard for the parallelization due to the ease in programming parallel-for loops in a fork-join manner. Many shared-memory applications are implemented using this model despite not being ideal for applications with high load imbalance, such as those that make irregular memory accesses. One parameter, i.e., chunk size, is made available to users in order to mitigate performance loss. However, this parameter is dependent on architecture, system load, application, and input; making it difficult to tune. We present an OpenMP scheduler that does an adaptive tuning for chunk size for unbalanced applications that make irregular memory accesses. In particular, this method(iCh) uses work-stealing for imbalance and adapts chunk size using a force-feedback model that approximates variance of task length in a chunk. This scheduler has low overhead and allows for active load balancing while the applications are running. We demonstrate this using both sparse matrix-vector multiplication (spmv) and Betweenness Centrality (bc) and show that iCh can achieve average speedups close (i.e., within 1.061x for spmv and 1.092x for bc) of either OpenMP loops scheduled with dynamic or work-stealing methods that had chunk size tuned offline.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要