Modeling Memory Contention between Communications and Computations in Distributed HPC Systems

2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2022

Abstract
To amortize the cost of MPI communications, distributed parallel HPC applications can overlap network communications with computations in the hope of improving overall application performance. With this technique, computations and communications run at the same time. However, computations usually also perform data movements. Since data for computations and for communications go through the same memory system, memory contention may occur when computations are memory-bound and large messages are transmitted through the network at the same time. In this paper, we propose a model that predicts memory bandwidth for computations and for communications when they are executed side by side, according to data locality and taking contention into account. Developing the model made it possible to better understand where bottlenecks are located in the memory system and which strategies the memory system applies under contention. The model was evaluated on many platforms with different characteristics and showed an average prediction error below 4%.
Keywords
HPC, MPI, Memory Contention, NUMA, Bandwidth, Predictive Models, Multicore Processing