Dissecting Cyclops: a detailed analysis of a multithreaded architecture.

SIGARCH Computer Architecture News (2003)

Abstract
Multiprocessor systems-on-a-chip offer a structured approach to managing complexity in chip design. Cyclops is a new family of multithreaded architectures which integrates processing logic, main memory and communications hardware on a single chip. Its simple, hierarchical design allows the hardware architect to manage a large number of components to meet the design constraints in terms of performance, power or application domain.
This paper evaluates several alternative Cyclops designs with different relative costs and trade-offs. We compare the performance of several scientific kernels running on different configurations of this architecture. We show that by increasing the number of threads sharing a floating-point unit we can hide fairly high cache and memory latencies. We prove that we can reach the theoretical peak performance of the chip and we identify the optimal balance of components for each application. We demonstrate that the design is well suited to problems that are difficult to optimize. For example, we show that sparse matrix-vector multiplication obtains 16 GFlops out of 32 GFlops of peak performance.
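As an illustration of the kernel named in the abstract, the following is a minimal sketch of sparse matrix-vector multiplication. The abstract does not say which storage format or code the authors used; the compressed sparse row (CSR) layout, the function names and the small test matrix here are all assumptions made for the example. The indirect access `x[col_idx[k]]` is the irregular memory pattern that typically makes this kernel hard to optimize. The helper at the end only restates the abstract's figures as a fraction of peak.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed layout)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Nonzeros of this row occupy values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect, data-dependent load from x.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def fraction_of_peak(achieved_gflops, peak_gflops):
    """Achieved performance as a fraction of theoretical peak."""
    return achieved_gflops / peak_gflops


# Hypothetical 3x3 matrix: [[2, 0, 1], [0, 3, 0], [4, 0, 5]]
values = [2.0, 1.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = spmv_csr(values, col_idx, row_ptr, x)
ratio = fraction_of_peak(16.0, 32.0)  # figures quoted in the abstract
```

With the abstract's numbers, 16 GFlops out of 32 GFlops corresponds to half of the chip's peak performance.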
Keywords
detailed analysis, communications hardware, application domain, design constraint, different configuration, hierarchical design, alternative cyclops design, chip design, single chip, theoretical peak performance, multithreaded architecture, dissecting cyclops, peak performance, chip, floating point unit, memory latency, system on a chip, sparse matrix