Less for More: Reducing Intra-CGRA Connectivity for Higher Performance and Efficiency in HPC

IPDPS Workshops (2023)

Abstract
Coarse-Grained Reconfigurable Arrays (CGRAs) are a class of reconfigurable architectures that combine the performance of domain-specific accelerators with the reconfigurability of Field-Programmable Gate Arrays (FPGAs). Historically, CGRAs have been used successfully to accelerate embedded applications, and they are now being considered for accelerating High-Performance Computing (HPC) applications in future supercomputers. However, embedded systems and supercomputers are two vastly different domains with different applications and constraints, and it is not yet fully understood which CGRA design decisions adequately cater to the HPC market. One such open design decision concerns the interconnect that facilitates intra-CGRA communication. Our findings show that even the typical king-style mesh-like topology is often underutilized under a typical HPC workload, leading to inefficiency. This research explores the provisioning of intra-CGRA interconnect for HPC-oriented workloads and ultimately aims to recover, by reducing the interconnect complexity, the potential performance and efficiency that is otherwise lost. We propose several reduced interconnect topologies based on usage statistics, and then evaluate the tradeoffs in hardware cost, routability of data-flow graphs (DFGs), and computational throughput.
Keywords
CGRA, Routing architecture, Design space exploration, HPC, RTL simulation