A scalable, high-performance customized priority queue

FPL(2014)

引用 21|浏览58
暂无评分
摘要
Priority queues are abstract data structures where each element is associated with a priority, and the highest priority element is always retrieved first from the queue. The data structure is widely used within databases, including the last stage of a merge-sort, forecasting read-ahead I/O to stream data for the merge-sort, and replacement selection sort. Typical software implementations use a balanced binary tree-based structure, providing O(log N) time for both enqueue and dequeue operations. To improve the performance, we propose several scalable and high-speed FPGA-based implementations of a priority queue. Our insight is that the above listed applications primarily use priority queues through “replace” operations, which remove the highest priority element and place a new element into the queue. Thus, our designs are customized for this operation, allowing for a simple and scalable architecture. We implement three priority queue designs, including use of a register-based array, register-based tree, and BRAM-based tree, which have different benefits and trade-offs of throughput, frequency, and maximum size. More importantly, all designs achieve O(1) time between replace operations. To incorporate the best aspects of our designs, we propose a Hybrid Priority Queue (H-PQ), which combines a register-based array with multiple BRAM-based trees. This design provides, on average, very fast access times to the top items in the queue (through the register-based array), while scaling to large priority queue sizes (through the BRAM-based trees). In our evaluations, we find that H-PQ achieves 4.3x speedup and 21.5x energy efficiency, compared with the Xeon CPU implementations.
更多
查看译文
关键词
queue size,register-based tree,priority element,replacement selection sort,merge-sort forecasting read-ahead input-output,dequeue operation,merge-sort,abstract data structures,data structures,enqueue operation,high-performance customized priority queue,field programmable gate array,replace operation,fpga-based implementation,register-based array,xeon cpu implementation,field programmable gate arrays,energy efficiency,bram-based tree,database systems,registers,throughput
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要