A scalable communication-aware compilation flow for programmable accelerators

2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC) (2016)

Abstract
Programmable accelerators (PAs) are receiving increased attention in domain-specific architecture designs because they provide more general support for customization. In a PA-rich system, computational kernels are compiled into predefined PA templates and dynamically mapped to real PAs at runtime. This poses a demanding challenge for the compiler: how to generate high-quality PA mapping code. Another important concern is the communication cost among PAs: if not handled properly at compile time, data transfers among tens or hundreds of accelerators in a PA-rich system will limit the overall performance gain. In this paper we present an efficient PA compilation flow that scales to mapping large computation kernels onto PA-rich architectures. Communication overhead is modeled and optimized in the proposed flow to reduce runtime data transfers among accelerators. Experimental results show that, for 12 computation-intensive standard benchmarks, the proposed approach significantly improves compilation scalability, mapping quality, and overall communication cost compared to state-of-the-art PA compilation approaches. We also evaluate the proposed flow on a recently developed PA-rich platform [1]; the final performance gain is improved by 49.5% on average.
Keywords
scalable communication-aware compilation, programmable accelerator, high-quality PA mapping code, PA-rich architecture