PID-Comm: A Fast and Flexible Collective Communication Framework for Commodity Processing-in-DIMM Devices

Si Ung Noh, Junguk Hong, Chaemin Lim, Seongyeon Park, Jeehyun Kim, Hanjun Kim, Youngsok Kim, Jinho Lee

CoRR (2024)

Abstract
Recent dual in-line memory modules (DIMMs) are starting to support processing-in-memory (PIM) by associating their memory banks with processing elements (PEs), allowing applications to overcome the data movement bottleneck by offloading memory-intensive operations to the PEs. Many highly parallel applications have been shown to benefit from these PIM-enabled DIMMs, but further speedup is often limited by the large overhead of inter-PE communication. This overhead stems mainly from slow, CPU-mediated inter-PE communication methods, which makes it difficult for PIM-enabled DIMMs to accelerate a wider range of applications. Prior studies have tried to alleviate the communication bottleneck, but they lack the flexibility and performance needed for a wide range of applications. In this paper, we present PID-Comm, a fast and flexible collective inter-PE communication framework for commodity PIM-enabled DIMMs. The key idea of PID-Comm is to abstract the PEs as a multi-dimensional hypercube and allow multiple instances of collective inter-PE communication between the PEs belonging to certain dimensions of the hypercube. Leveraging this abstraction, PID-Comm first defines eight collective inter-PE communication patterns that allow applications to easily express their complex communication patterns. Then, PID-Comm provides high-performance implementations of the collective inter-PE communication patterns optimized for the DIMMs. Our evaluation using 16 UPMEM DIMMs and representative parallel algorithms shows that PID-Comm improves performance by up to 4.20x compared to the existing inter-PE communication implementations. The implementation of PID-Comm is available at https://github.com/AIS-SNU/PID-Comm.
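
To make the hypercube abstraction concrete, the following is a minimal host-side sketch in plain Python. It is not PID-Comm's API: the function names (`pe_coords`, `comm_groups`, `all_reduce_sum`), the row-major PE numbering, and the choice of a sum all-reduce as the example collective are all assumptions made for illustration, and the abstract does not enumerate the eight patterns. The sketch only shows how selecting certain dimensions of the hypercube splits the PEs into several independent communication groups, each of which runs its own instance of a collective.

```python
from collections import defaultdict


def pe_coords(pe_id, shape):
    """Map a linear PE id to hypercube coordinates.

    Assumes row-major numbering (last dimension varies fastest);
    this numbering is an illustrative assumption.
    """
    coords = []
    for extent in reversed(shape):
        coords.append(pe_id % extent)
        pe_id //= extent
    return tuple(reversed(coords))


def comm_groups(shape, comm_dims):
    """Partition PEs into groups that differ only along comm_dims.

    PEs sharing the same coordinates on every non-selected dimension
    land in the same group, so each group is one collective instance.
    """
    num_pes = 1
    for extent in shape:
        num_pes *= extent

    groups = defaultdict(list)
    for pe in range(num_pes):
        coords = pe_coords(pe, shape)
        # The group key is the coordinates on the dimensions NOT selected.
        key = tuple(c for d, c in enumerate(coords) if d not in comm_dims)
        groups[key].append(pe)
    return list(groups.values())


def all_reduce_sum(values, shape, comm_dims):
    """Simulate a sum all-reduce inside every communication group.

    values[pe] is the scalar held by each PE; the result gives every
    PE the sum over its own group.
    """
    result = list(values)
    for group in comm_groups(shape, comm_dims):
        total = sum(values[pe] for pe in group)
        for pe in group:
            result[pe] = total
    return result


# Eight PEs arranged as a 2 x 2 x 2 hypercube, each holding its own id.
shape = (2, 2, 2)
values = list(range(8))

# Communicate along dimension 0 only: four groups of two PEs each.
along_dim0 = all_reduce_sum(values, shape, {0})

# Communicate along dimensions 1 and 2: two groups of four PEs each.
along_dims12 = all_reduce_sum(values, shape, {1, 2})
```

In this toy setup, selecting dimension 0 yields four two-PE groups, while selecting dimensions 1 and 2 yields two four-PE groups, which mirrors the idea of running multiple collective instances over chosen dimensions. A real implementation on PIM-enabled DIMMs would move data through the host and the DIMM memory layout rather than a Python list.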