Increment-and-Freeze: Every Cache, Everywhere, All of the Time
PROCEEDINGS OF THE 35TH ACM SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES, SPAA 2023 (2023)
Abstract
One of the most basic algorithmic problems concerning caches is to compute the LRU hit-rate curve on a given trace. Unfortunately, the known algorithms exhibit poor data locality and fail to scale to large caches. It is widely believed that the LRU hit-rate curve cannot be computed efficiently enough to be used in online production settings. This has led to a large literature on heuristics that aim to approximate the curve efficiently. In this paper, we show that the poor data locality of past algorithms can be avoided. We introduce a new algorithm, called INCREMENT-AND-FREEZE, for computing exact LRU hit-rate curves. The algorithm achieves RAM-model complexity O(n log n), external-memory complexity O((n/B) log n), and parallelism Θ(log n). We also present two theoretical extensions of INCREMENT-AND-FREEZE: one that achieves SORT complexity in the external-memory model, and one that achieves a parallel span of O(log^2 n), which yields near-linear parallelism, while maintaining work efficiency. We implement INCREMENT-AND-FREEZE [5] and obtain a speedup of up to 9x over the classical augmented-tree algorithm on a single processor. On 16 threads, the speedup grows to as much as 60x. Compared with the previous state-of-the-art parallel algorithm, INCREMENT-AND-FREEZE achieves a speedup of up to 10x when both algorithms use the same number of threads.
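To make the problem concrete, the sketch below illustrates what an exact LRU hit-rate curve is: for each request in a trace, the stack distance is the number of distinct addresses touched since the previous access to that address, and the hit rate at cache size k is the fraction of requests with stack distance at most k. This is only a naive O(n*m) illustration of the quantity being computed, not the paper's INCREMENT-AND-FREEZE algorithm or the augmented-tree algorithm; all names in it are chosen for this example.

```cpp
// Minimal sketch (assumed names, naive method): exact LRU stack distances
// accumulated into a hit-rate curve. Not the paper's algorithm.
#include <cstdint>
#include <cstdio>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// hits[k] = number of requests that would hit in an LRU cache holding k items.
std::vector<uint64_t> lru_hit_counts(const std::vector<uint64_t>& trace) {
    std::vector<uint64_t> hits(trace.size() + 1, 0);
    std::unordered_map<uint64_t, size_t> last_seen;  // address -> index of last access
    for (size_t i = 0; i < trace.size(); ++i) {
        auto it = last_seen.find(trace[i]);
        if (it != last_seen.end()) {
            // Stack distance: distinct addresses accessed since the previous
            // access to trace[i], plus the address itself.
            std::unordered_set<uint64_t> distinct;
            for (size_t j = it->second + 1; j < i; ++j) distinct.insert(trace[j]);
            size_t d = distinct.size() + 1;
            if (d < hits.size()) ++hits[d];
        }
        last_seen[trace[i]] = i;
    }
    // A hit at distance d is also a hit for every cache size >= d.
    for (size_t k = 1; k < hits.size(); ++k) hits[k] += hits[k - 1];
    return hits;
}

int main() {
    std::vector<uint64_t> trace = {1, 2, 3, 1, 2, 3, 4, 1};
    auto hits = lru_hit_counts(trace);
    for (size_t k = 1; k <= 4; ++k)
        std::printf("cache size %zu: hit rate %.2f\n", k, double(hits[k]) / trace.size());
}
```

The classical augmented-tree algorithm referenced in the abstract computes the same stack distances in O(n log n) time but with poor data locality; the paper's contribution is an algorithm with the same work bound, good external-memory behavior, and parallelism.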
Keywords
LRU, Hit-rate Curves, Miss-ratio Curves, Success Function, Reuse Distance, Stack Distance, Working Set, Divide-And-Conquer, Caching, External-Memory, IO-Optimal, Parallelism