Run-Jump-Run: Bouquet of Instruction Pointer Jumpers for High Performance Instruction Prefetching
semanticscholar(2020)
摘要
Large instruction working sets are common with modern client and server workloads. These working sets often fit in the large last-level cache (LLC). However, the L1 instruction cache (L1-I) suffers from a high miss rate blocking the instruction supply to the front-end of the processor. Instruction prefetching is a latency hiding technique that can bring instructions from the LLC into the L1-I. We propose a bouquet of instruction pointer (IP) jumpers, named JIP. JIP is a high-performance L1-I prefetcher that uses different prefetching techniques by classifying instructions into the following categories: (i) a nonbranch, (ii) a branch that jumps to a single target IP on all instances, and (iii) a branch that jumps to different target IPs on different instances. Compared to a baseline with no instruction prefetching, averaged across 50 traces, JIP provides a prefetch coverage of 91.33% (as high as 99.99%), which leads to a performance improvement of 27.75% (as high as 93%). JIP makes a strong case for instruction prefetching as the performance gap between the perfect L1-I and JIP is just 7.49%. JIP demands a hardware overhead of 127.8 KB.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要