CREPE: Concurrent Reverse-modulo-scheduling and Placement for CGRAs

Chilankamol Sunny, Satyajit Das, Kevin J. M. Martin, Philippe Coussy

IEEE Transactions on Parallel and Distributed Systems (2024)

Abstract
Coarse-Grained Reconfigurable Array (CGRA) architectures are popular as high-performance and energy-efficient computing devices. Compute-intensive loop constructs of complex applications are mapped onto CGRAs by modulo-scheduling the innermost loop dataflow graph (DFG). In state-of-the-art approaches, mapping quality is typically determined by the initiation interval (II), while the schedule length of one iteration is neglected. However, for nested loops, schedule length becomes important. In this paper, we propose CREPE, a Concurrent Reverse-modulo-scheduling and Placement technique for CGRAs that minimizes both II and schedule length. CREPE performs simultaneous modulo-scheduling and placement coupled with dynamic graph transformations, generating good-quality mappings with high success rates. Furthermore, we introduce a compilation flow that maps nested loops onto the CGRA and modulo-schedules the innermost loop using CREPE. Experiments show that the proposed solution outperforms conventional approaches in mapping success rate and total execution time, with no impact on compilation time. CREPE maps all kernels considered, while the state-of-the-art techniques Crimson and Epimap either failed to find a mapping or mapped at very high IIs. On a 2×4 CGRA, CREPE reports a 100% success rate and a speed-up of up to 5.9× over Crimson (78.5% success rate) and up to 1.4× over Epimap (46.4% success rate).
Keywords
coarse-grained reconfigurable array (CGRA), modulo-scheduling, loop optimization
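
The abstract's remark that schedule length matters for nested loops can be illustrated with the standard cycle count of a modulo-scheduled loop: the first iteration takes the full schedule length, and each later iteration starts II cycles after the previous one. The sketch below is not taken from the paper. The function names and numbers are hypothetical, and it assumes the inner-loop pipeline drains completely before the next outer iteration begins.

```python
def pipelined_loop_cycles(ii: int, schedule_length: int, iterations: int) -> int:
    """Cycles for one modulo-scheduled loop.

    The first iteration occupies the full schedule length; every
    following iteration is issued II cycles after its predecessor.
    """
    if iterations <= 0:
        return 0
    return schedule_length + (iterations - 1) * ii


def nested_loop_cycles(ii: int, schedule_length: int,
                       inner_iterations: int, outer_iterations: int) -> int:
    # Assumption: no overlap between successive outer iterations, so the
    # schedule-length cost is paid once per outer iteration.
    return outer_iterations * pipelined_loop_cycles(
        ii, schedule_length, inner_iterations
    )


# Hypothetical mappings with the same II but different schedule lengths.
short_schedule = nested_loop_cycles(ii=2, schedule_length=6,
                                    inner_iterations=8, outer_iterations=100)
long_schedule = nested_loop_cycles(ii=2, schedule_length=12,
                                   inner_iterations=8, outer_iterations=100)
print(short_schedule, long_schedule)
```

Under these assumptions, two mappings with identical II differ in total cycles because the schedule length is multiplied by the outer trip count, which is why short inner loops inside deep nests are sensitive to it.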