The Broker Queue: A Fast, Linearizable FIFO Queue for Fine-Granular Work Distribution on the GPU.

ICS (2018)

Abstract
Harnessing the power of massively parallel devices like the graphics processing unit (GPU) is difficult for algorithms that show dynamic or inhomogeneous workloads. To achieve high performance, such advanced algorithms require scalable, concurrent queues to collect and distribute work. We show that previous queuing approaches are unfit for this task, as they either (1) do not work well in a massively parallel environment, or (2) obstruct the use of individual threads on top of single-instruction-multiple-data (SIMD) cores, or (3) block during access, thus prohibiting multi-queue setups. With these issues in mind, we present the Broker Queue, a highly efficient, fully linearizable FIFO queue for fine-granular parallel work distribution on the GPU. We evaluate its performance and usability on modern GPU models against a wide range of existing algorithms. The Broker Queue is up to three orders of magnitude faster than nonblocking queues and can even outperform significantly simpler techniques that lack desired properties for fine-granular work distribution.
Keywords
GPU, queuing, concurrent, parallel, scheduling