Packet coalescing exploiting data redundancy in GPGPU architectures.

ICS (2017)

Abstract
General Purpose Graphics Processing Units (GPGPUs) are becoming a cost-effective hardware approach for parallel computing. Many workloads on GPGPUs place heavy stress on the memory system, creating network bottlenecks near the memory controllers. We observe that data redundancy in communication traffic is commonplace across a wide range of GPGPU applications. To exploit this redundancy, we propose a packet coalescing mechanism that alleviates the network bottlenecks by directly reducing traffic volume. The key idea is to coalesce multiple packets into one, without increasing the packet size, when they carry redundant cache blocks. To ensure that coalesced packets are delivered to their respective destinations, we adopt multicast routing in the GPGPU interconnection network. Across various GPGPU applications, our coalescing approach yields a 15% IPC improvement (up to 112%) in a large-scale GPGPU with a 2D mesh, by reducing average memory access time (AMAT) by 15.5% (up to 65.2%) and saving 13% of network bandwidth (up to 37%). Our coalescing approach also achieves a 7% IPC improvement in the NVIDIA Fermi architecture with a crossbar.
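The key idea can be illustrated with a minimal software sketch. It is not the paper's hardware mechanism, which the abstract does not detail. It assumes a hypothetical `Packet` type holding a tuple of destination node IDs and one cache block as `payload`, and a hypothetical `coalesce` function that merges queued packets with identical payloads into a single multicast packet. The payload size stays the same; only the destination list grows. Packet ordering constraints and header overhead are ignored here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet model: destination node IDs plus one cache block."""
    dests: Tuple[int, ...]
    payload: bytes


def coalesce(queue: List[Packet]) -> List[Packet]:
    """Merge packets carrying identical cache blocks into one multicast packet.

    Assumption: two packets are "redundant" when their payloads are
    byte-for-byte equal. The merged packet keeps a single copy of the
    payload and the union of destinations, in first-seen order.
    """
    merged: Dict[bytes, List[int]] = {}  # payload -> destinations (insertion-ordered)
    for pkt in queue:
        dests = merged.setdefault(pkt.payload, [])
        for d in pkt.dests:
            if d not in dests:
                dests.append(d)
    return [Packet(tuple(dests), payload) for payload, dests in merged.items()]


# Example: two replies carry the same all-zero block to different cores.
queue = [
    Packet((1,), bytes(4)),
    Packet((2,), bytes(4)),
    Packet((3,), b"\x01\x02\x03\x04"),
]
out = coalesce(queue)
```

In this example, three queued packets become two: one multicast packet addressed to nodes 1 and 2, and one unicast packet addressed to node 3. A multicast-capable network would then replicate the first packet only where the routes to nodes 1 and 2 diverge.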