Increment-and-Freeze: Every Cache, Everywhere, All of the Time
PROCEEDINGS OF THE 35TH ACM SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES, SPAA 2023 (2023)
Abstract
One of the most basic algorithmic problems concerning caches is to compute the LRU hit-rate curve on a given trace. Unfortunately, the known algorithms exhibit poor data locality and fail to scale to large caches. It is widely believed that the LRU hit-rate curve cannot be computed efficiently enough to be used in online production settings. This has led to a large literature on heuristics that aim to approximate the curve efficiently. In this paper, we show that the poor data locality of past algorithms can be avoided. We introduce a new algorithm, called INCREMENT-AND-FREEZE, for computing exact LRU hit-rate curves. The algorithm achieves RAM-model complexity O(n log n), external-memory complexity O((n/B) log n), and parallelism Θ(log n). We also present two theoretical extensions of INCREMENT-AND-FREEZE: one that achieves SORT complexity in the external-memory model, and one that achieves a parallel span of O(log^2 n), which yields near-linear parallelism, while maintaining work efficiency. We implement INCREMENT-AND-FREEZE [5] and obtain a speedup of up to 9x over the classical augmented-tree algorithm on a single processor. On 16 threads, the speedup grows to as much as 60x. Compared with the previous state-of-the-art parallel algorithm, INCREMENT-AND-FREEZE achieves a speedup of up to 10x when both algorithms use the same number of threads.
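To make the problem concrete, the sketch below illustrates what an exact LRU hit-rate curve is: for each request in a trace, the stack distance is the number of distinct addresses touched since the previous access to that address, and the hit rate at cache size k is the fraction of requests with stack distance at most k. This is only a naive O(n*m) illustration of the quantity being computed, not the paper's INCREMENT-AND-FREEZE algorithm or the augmented-tree algorithm; all names in it are chosen for this example.

```cpp
// Minimal sketch (assumed names, naive method): exact LRU stack distances
// accumulated into a hit-rate curve. Not the paper's algorithm.
#include <cstdint>
#include <cstdio>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// hits[k] = number of requests that would hit in an LRU cache holding k items.
std::vector<uint64_t> lru_hit_counts(const std::vector<uint64_t>& trace) {
    std::vector<uint64_t> hits(trace.size() + 1, 0);
    std::unordered_map<uint64_t, size_t> last_seen;  // address -> index of last access
    for (size_t i = 0; i < trace.size(); ++i) {
        auto it = last_seen.find(trace[i]);
        if (it != last_seen.end()) {
            // Stack distance: distinct addresses accessed since the previous
            // access to trace[i], plus the address itself.
            std::unordered_set<uint64_t> distinct;
            for (size_t j = it->second + 1; j < i; ++j) distinct.insert(trace[j]);
            size_t d = distinct.size() + 1;
            if (d < hits.size()) ++hits[d];
        }
        last_seen[trace[i]] = i;
    }
    // A hit at distance d is also a hit for every cache size >= d.
    for (size_t k = 1; k < hits.size(); ++k) hits[k] += hits[k - 1];
    return hits;
}

int main() {
    std::vector<uint64_t> trace = {1, 2, 3, 1, 2, 3, 4, 1};
    auto hits = lru_hit_counts(trace);
    for (size_t k = 1; k <= 4; ++k)
        std::printf("cache size %zu: hit rate %.2f\n", k, double(hits[k]) / trace.size());
}
```

The classical augmented-tree algorithm referenced in the abstract computes the same stack distances in O(n log n) time but with poor data locality; the paper's contribution is an algorithm with the same work bound, good external-memory behavior, and parallelism.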
Keywords
LRU, Hit-rate Curves, Miss-ratio Curves, Success Function, Reuse Distance, Stack Distance, Working Set, Divide-And-Conquer, Caching, External-Memory, IO-Optimal, Parallelism