Slow and Steady: Measuring and Tuning Multicore Interference

Dan Iorga, Tyler Sorensen, John Wickerson, Alastair F. Donaldson

2020 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS) (2020)

Abstract

Now ubiquitous, multicore processors provide replicated compute cores that allow independent programs to run in parallel. However, shared resources, such as last-level caches, can cause otherwise-independent programs to interfere with one another, leading to significant and unpredictable effects on their execution time. Indeed, prior work has shown that specially crafted enemy programs can cause software systems of interest to experience orders-of-magnitude slowdowns when both are run in parallel on a multicore processor. This undermines the suitability of these processors for tasks that have real-time constraints. In this work, we explore the design and evaluation of techniques for empirically testing interference using enemy programs, with an eye towards reliability (how reproducible the interference results are) and portability (how interference testing can be effective across chips). We first show that different methods of measurement yield significantly different magnitudes of, and variation in, observed interference effects when applied to an enemy process that was shown to be particularly effective in prior work. We propose a method of measurement based on percentiles and confidence intervals, and show that it provides both competitive and reproducible observations. The reliability of our measurements allows us to explore auto-tuning, where enemy programs are further specialised per architecture. We evaluate three different tuning approaches (random search, simulated annealing, and Bayesian optimisation) on five different multicore chips, spanning x86 and ARM architectures. To show that our tuned enemy programs generalise to applications, we evaluate the slowdowns caused by our approach on the AutoBench and CoreMark benchmark suites. Our method achieves a statistically larger slowdown compared to prior work in 35 out of 105 benchmark/chip combinations, with a maximum difference of $3.8\times$. We envision that empirical approaches, such as ours, will be valuable for ‘first pass’ evaluations when investigating which multicore processors are suitable for real-time tasks.

Keywords

multicore chips, tuned enemy programs, statistically larger slowdown, empirical approaches, multicore processor, real-time tasks, ubiquitous processors, multicore processors, compute cores, shared resources, last-level caches, otherwise-independent programs, significant effects, unpredictable effects, execution time, specially crafted enemy programs, orders-of-magnitude slowdowns, real-time constraints, empirically testing interference, interference results, interference testing, measurement yield significantly different magnitudes, observed interference effects, enemy process, competitive observations, reproducible observations, auto-tuning, tuning approaches, AutoBench, CoreMark benchmark suites, ARM architectures, x86 architectures
