Affinity-aware work-stealing for integrated CPU-GPU processors.

PPOPP(2016)

引用 8|浏览155
暂无评分
摘要
Recent integrated CPU-GPU processors like Intel's Broadwell and AMD's Kaveri support hardware CPU-GPU shared virtual memory, atomic operations, and memory coherency. This enables fine-grained CPU-GPU work-stealing, but architectural differences between the CPU and GPU hurt the performance of traditionally-implemented work-stealing on such processors. These architectural differences include different clock frequencies, atomic operation costs, and cache and shared memory latencies. This paper describes a preliminary implementation of our work-stealing scheduler, Libra, which includes techniques to deal with these architectural differences in integrated CPU-GPU processors. Libra's affinity-aware techniques achieve significant performance gains over classically-implemented work-stealing. We show preliminary results using a diverse set of nine regular and irregular workloads running on an Intel Broadwell Core-M processor. Libra currently achieves up to a 2× performance improvement over classical work-stealing, with a 20% average improvement.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要