Fine Grain Algorithm Parallelization on a Hybrid Control-flow and Dataflow Processor

Research Square (2023)

Abstract
The execution time of a high-performance computing algorithm depends on multiple factors, including the scalability of the algorithm, the chosen hardware, and the communication speed between processing elements. This work is based on a hybrid processor consisting of both control-flow and dataflow hardware, where the control-flow hardware includes a manycore architecture. A decomposition of the algorithm into portions suited to the different architectures is presented. The approach is demonstrated by decomposing the Lattice-Boltzmann method, implemented for both control-flow and dataflow hardware. The total acceleration factor of the decomposed Lattice-Boltzmann method, compared to the execution time on the control-flow and the dataflow hardware, is obtained from an analysis that assumes the communication speed between the two kinds of hardware equals the speed of shared cache memories. The results indicate the advantage of the proposed hybrid architecture and its capability to accelerate considerably even algorithms that are suitable for dataflow architectures. An analysis of the problem size needed to justify the use of control-flow hardware is also presented. The main benefit lies in accelerating algorithms for which only some portions are suitable for dataflow hardware.
Keywords
fine grain algorithm parallelization, control-flow