LUCAS: latency-adaptive unified cluster assignment and instruction scheduling

LCTES (2013)

Abstract

Clustered VLIW architectures are statically scheduled wide-issue architectures that combine the advantages of wide-issue processors with the power and frequency scalability of clustered designs. Because they are statically scheduled, the compiler must decide how instructions are mapped to clusters. State-of-the-art code generation for such architectures combines cluster assignment and instruction scheduling in a single unified pass. The performance of the generated code, however, is highly sensitive to the inter-cluster communication latency. This is due to the nature of the two clustering heuristics used: one is aggressive and works well for low inter-cluster latencies, while the other is more conservative and works well only for high latencies. In this paper we propose LUCAS, a novel unified cluster-assignment and instruction-scheduling algorithm that adapts to the inter-cluster latency better than the existing state-of-the-art schemes. LUCAS is a hybrid scheme that performs fine-grain switching between the two state-of-the-art clustering heuristics, leading to better scheduling than either of them. It generates better-performing code for a wide range of inter-cluster latency values.
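
To make the idea of per-instruction switching concrete, the following is a toy sketch only and is not taken from the paper. The abstract does not define the two heuristics or the switching rule, so everything here is an assumption made for illustration: the names `aggressive`, `conservative`, and `make_hybrid`, the machine model (one issue slot per cluster per cycle, unit instruction latency), and in particular the threshold rule inside `make_hybrid`.

```python
from dataclasses import dataclass, field

@dataclass
class Instr:
    name: str
    preds: list = field(default_factory=list)  # names of producer instructions
    latency: int = 1                            # assumed unit latency

def earliest_start(ins, cluster, placed, comm_latency):
    """Earliest cycle at which all operands of `ins` are available on `cluster`."""
    ready = 0
    for p in ins.preds:
        p_cluster, p_finish = placed[p]
        penalty = 0 if p_cluster == cluster else comm_latency
        ready = max(ready, p_finish + penalty)
    return ready

def schedule(instrs, n_clusters, comm_latency, choose):
    """Greedy unified pass: pick cluster and cycle together, in topological order."""
    placed = {}                                # name -> (cluster, finish cycle)
    busy = [set() for _ in range(n_clusters)]  # occupied issue cycles per cluster
    for ins in instrs:
        options = []
        for c in range(n_clusters):
            cycle = earliest_start(ins, c, placed, comm_latency)
            while cycle in busy[c]:            # one issue slot per cluster per cycle
                cycle += 1
            options.append((c, cycle))
        c, cycle = choose(ins, options, placed)
        busy[c].add(cycle)
        placed[ins.name] = (c, cycle + ins.latency)
    return placed

def aggressive(ins, options, placed):
    """Assumed 'aggressive' heuristic: earliest issue cycle, ties to lowest cluster."""
    return min(options, key=lambda o: (o[1], o[0]))

def conservative(ins, options, placed):
    """Assumed 'conservative' heuristic: stay on the cluster holding most producers."""
    def key(o):
        c, cycle = o
        local = sum(1 for p in ins.preds if placed[p][0] == c)
        return (-local, cycle, c)
    return min(options, key=key)

def make_hybrid(comm_latency):
    """Hypothetical switching rule (not the paper's): per instruction, take the
    aggressive choice only if it issues at least `comm_latency` cycles earlier."""
    def hybrid(ins, options, placed):
        a = aggressive(ins, options, placed)
        k = conservative(ins, options, placed)
        return a if k[1] - a[1] >= comm_latency else k
    return hybrid

def makespan(placed):
    return max(finish for _, finish in placed.values())

# Fan-shaped dependence graph: r feeds x1..x4, which all feed j.
dag = [Instr("r")] + [Instr(f"x{i}", ["r"]) for i in range(1, 5)]
dag.append(Instr("j", ["x1", "x2", "x3", "x4"]))

for lat in (0, 2):
    print(lat,
          makespan(schedule(dag, 2, lat, aggressive)),
          makespan(schedule(dag, 2, lat, conservative)),
          makespan(schedule(dag, 2, lat, make_hybrid(lat))))
```

On this small graph with two clusters, the aggressive choice finishes in 4 cycles at latency 0 but 7 cycles at latency 2, the conservative choice takes 6 cycles at both latencies, and the hybrid rule matches the better of the two in each case (4 and 6). That mirrors the behavior the abstract describes, but only for this toy model; the threshold rule is not guaranteed to match or beat both heuristics on other graphs.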

Keywords

inter-cluster communication latency, inter-cluster latency, low inter-cluster latency, novel unified cluster-assignment, high latency, state-of-the-art code generation, latency-adaptive unified cluster assignment, better scheduling, instruction scheduling, clustering heuristics, inter-cluster latency value
