Performance optimization for parallel systems with shared DWM via retiming, loop scheduling, and data placement

Journal of Systems Architecture (2021)

Abstract
Domain Wall Memory (DWM) is an ideal candidate for replacing traditional memories, especially in parallel systems, because it offers many desirable characteristics such as low leakage power, high density, and low access latency. However, because of the tape-like architecture of DWM, shift operations have a significant impact on performance. For data-intensive applications with massive loops and arrays, increasing loop parallelism and choosing appropriate loop scheduling and data placement on DWM can significantly improve the performance of parallel systems. This paper explores optimizing the performance of parallel systems through retiming, loop scheduling, and data placement, especially when the data are arrays. It proposes an Integer Linear Programming (ILP) formulation and a Scheduling While Placing (SWP) algorithm to generate optimal or near-optimal loop schedules and data placements with minimum execution time. The experimental results show that SWP and ILP effectively reduce execution time compared with the greedy List Scheduling First Access First Place (LF) algorithm. In addition, this paper proposes a Threshold Retiming Repetition (TRR) algorithm to combine the retiming technique with SWP and ILP. The experimental results show that SWP+TRR and ILP+TRR further reduce execution time compared with the results obtained without retiming.
Keywords
Domain wall memory, Loop scheduling, Data placement, Retiming, Shift operation
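
The abstract's point that shift operations dominate performance on a tape-like DWM, and that data placement changes how many shifts are needed, can be illustrated with a small sketch. It assumes a deliberately simplified model that is not taken from the paper: a single tape with a single access port, where reaching a data item costs one shift per position moved. The function name `shift_cost`, the variable names, and the access sequence are all hypothetical, and this is not the paper's ILP, SWP, LF, or TRR method.

```python
def shift_cost(placement, accesses, port_start=0):
    """Count shifts needed to serve `accesses` in order.

    Assumed model (not from the paper): one tape, one access port,
    and moving the port alignment by k positions costs k shifts.
    `placement` maps each data item to its position on the tape.
    """
    position = port_start
    total = 0
    for name in accesses:
        target = placement[name]
        total += abs(target - position)  # shifts to align target with the port
        position = target
    return total


# Hypothetical access sequence produced by some loop schedule.
accesses = ["a", "c", "a", "c", "b"]

# Two candidate placements of the same three items.
declaration_order = {"a": 0, "b": 1, "c": 2}
access_aware = {"a": 0, "c": 1, "b": 2}

print(shift_cost(declaration_order, accesses))  # 7
print(shift_cost(access_aware, accesses))       # 4
```

Under this toy model, placing the frequently alternating items `a` and `c` next to each other reduces the shift count from 7 to 4, which is the kind of effect a placement-aware scheduler would try to exploit.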