Cams As Synchronizing Caches For Multithreaded Irregular Applications On Fpgas

International Conference on Computer Aided Design(2015)

引用 4|浏览39
暂无评分
摘要
Irregular applications, by their very nature, suffer from poor data locality. This often results in high miss rates for caches, and many long waits to off-chip memory. Historically, long latencies have been dealt with in two ways: (1) latency mitigation using large cache hierarchies, or (2) latency masking where threads relinquish their control after issuing a memory request. Multithreaded CPUs are designed for a fixed maximum number of threads tailored for an average application. FPGAs, however, can be customized to specific applications. Their massive parallelism is well known, and ideally suited to dynamically manage hundreds, or thousands of threads. Multithreading, in essence, trades memory bandwidth for latency. Therefore, to achieve a high throughput, the system must support a large memory bandwidth. Many irregular application, however, must rely on inter-thread synchronization for parallel execution. In-memory synchronization suffers from very long memory latencies. In this paper we describe the use of CAMs (Content Addressable Memories) as synchronizing caches for hardware multithreading. We demonstrate and evaluate this mechanism using graph breadth-first search (BFS).
更多
查看译文
关键词
FPGA,Multithreading,bandwidth,CAM,synchronizing caches
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要