Increment-and-Freeze: Every Cache, Everywhere, All of the Time

Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2023)

Abstract
One of the most basic algorithmic problems concerning caches is to compute the LRU hit-rate curve on a given trace. Unfortunately, the known algorithms exhibit poor data locality and fail to scale to large caches. It is widely believed that the LRU hit-rate curve cannot be computed efficiently enough to be used in online production settings. This has led to a large literature on heuristics that aim to approximate the curve efficiently. In this paper, we show that the poor data locality of past algorithms can be avoided. We introduce a new algorithm, called INCREMENT-AND-FREEZE, for computing exact LRU hit-rate curves. The algorithm achieves RAM-model complexity O(n log n), external-memory complexity O((n/B) log n), and parallelism Θ(log n). We also present two theoretical extensions of INCREMENT-AND-FREEZE: one that achieves SORT complexity in the external-memory model, and one that achieves a parallel span of O(log² n), i.e., near-linear parallelism, while maintaining work efficiency. We implement INCREMENT-AND-FREEZE [5] and obtain a speedup of up to 9x over the classical augmented-tree algorithm on a single processor. On 16 threads, the speedup grows to as much as 60x. Compared to the previous state-of-the-art parallel algorithm, INCREMENT-AND-FREEZE achieves a speedup of up to 10x when both algorithms use the same number of threads.
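To make the underlying problem concrete, below is a minimal single-threaded C++ sketch of the classical tree-based baseline that the paper compares against: each access's stack distance (the number of distinct items touched since the previous access to the same item) is computed with a tree indexed by access time, and the histogram of stack distances yields the exact LRU hit-rate curve. This is not the Increment-and-Freeze algorithm itself; a Fenwick tree stands in for the augmented balanced tree, and names such as lru_histogram are our own.

#include <cstdint>
#include <cstdio>
#include <unordered_map>
#include <vector>

// Fenwick (binary indexed) tree over access times; a 1 at position t means
// "the item most recently accessed at time t". Prefix sums then count
// distinct items accessed in a time window.
struct Fenwick {
    std::vector<int> t;
    explicit Fenwick(int n) : t(n + 1, 0) {}
    void add(int i, int v) { for (i++; i < (int)t.size(); i += i & -i) t[i] += v; }
    int prefix(int i) { int s = 0; for (i++; i > 0; i -= i & -i) s += t[i]; return s; }
    int range(int l, int r) { return l > r ? 0 : prefix(r) - (l ? prefix(l - 1) : 0); }
};

// hist[d] counts accesses with stack distance d; hist[n+1] counts cold misses.
std::vector<uint64_t> lru_histogram(const std::vector<uint64_t>& trace) {
    int n = (int)trace.size();
    Fenwick alive(n);
    std::unordered_map<uint64_t, int> last;  // item -> time of its latest access
    std::vector<uint64_t> hist(n + 2, 0);
    for (int i = 0; i < n; i++) {
        auto it = last.find(trace[i]);
        if (it == last.end()) {
            hist[n + 1]++;                          // first access: misses at every cache size
        } else {
            int p = it->second;
            int d = 1 + alive.range(p + 1, i - 1);  // distinct items since previous access
            hist[d]++;
            alive.add(p, -1);                       // old position is no longer the latest
        }
        alive.add(i, +1);
        last[trace[i]] = i;
    }
    return hist;
}

int main() {
    std::vector<uint64_t> trace = {1, 2, 3, 1, 2, 3, 4, 1};
    auto hist = lru_histogram(trace);
    double hits = 0;
    for (size_t c = 1; c + 1 < hist.size(); c++) {  // an access hits iff its distance <= cache size
        hits += (double)hist[c];
        std::printf("cache size %zu: hit rate %.3f\n", c, hits / (double)trace.size());
    }
    return 0;
}

Note that every access updates a tree of size n via pointer-style index hops, which is precisely the poor data locality the abstract says past algorithms suffer from and that Increment-and-Freeze's divide-and-conquer structure is designed to avoid.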
Keywords
LRU, Hit-rate Curves, Miss-ratio Curves, Success Function, Reuse Distance, Stack Distance, Working Set, Divide-And-Conquer, Caching, External-Memory, IO-Optimal, Parallelism