Lattice Gauge Theory on a Multi-Core Processor, Cell/B.E.

Procedia Computer Science (2011)

Abstract
We report our experience implementing a lattice gauge theory code on the Cell Broadband Engine (Cell/B.E.), a new heterogeneous multi-core processor. As a typical operation, we take SU(3) matrix multiplication, which is one of the most important parts of lattice gauge theory calculations. Taking full advantage of the Cell/B.E., including SIMD operations and its many registers, which enable full use of the arithmetic units through loop unrolling, we obtain about 200 GFLOPS with 16 SPEs, which corresponds to around 80% of the theoretical peak. To our knowledge, this is the fastest speed for this operation obtained on the Cell/B.E. so far. However, when we measure the total time including the data supply, the speed drops to about 13 GFLOPS. We found that the bandwidth of the data transfer between the main memory and the EIB, 25 GB/s, is the bottleneck. In other words, it is possible to run the arithmetic units of the Cell/B.E. at 200 GFLOPS, but the current socket structure of the Cell/B.E. prevents it. We discuss several techniques that partially mitigate the problem by reducing the amount of transferred data.
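The gap between the compute-only and end-to-end figures can be illustrated with a back-of-the-envelope estimate. The sketch below is not the authors' code: the function names are hypothetical, and it assumes single-precision data and that every product streams two input matrices in and one result out with no reuse. Under those assumptions a 25 GB/s link caps the rate at roughly 23 GFLOPS, the same order of magnitude as the measured 13 GFLOPS and far below 200 GFLOPS.

```python
# Illustrative sketch only. Names and the single-precision, no-reuse
# assumptions are not taken from the paper.

def su3_multiply(a, b):
    """Return the product of two 3x3 complex matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]

# Real floating-point operations in one 3x3 complex product:
# 27 complex multiplies (6 flops each) plus 18 complex adds (2 flops each).
FLOPS_PER_PRODUCT = 27 * 6 + 18 * 2  # 198

def bandwidth_bound_gflops(bandwidth_gb_per_s, bytes_per_real=4):
    """Upper bound on GFLOPS if each product moves A and B in and C out.

    bytes_per_real=4 assumes single precision; pass 8 for double precision.
    """
    bytes_per_matrix = 9 * 2 * bytes_per_real   # 9 complex entries
    bytes_per_product = 3 * bytes_per_matrix    # two loads and one store
    return bandwidth_gb_per_s * FLOPS_PER_PRODUCT / bytes_per_product

# A cyclic permutation matrix is unitary with determinant 1, so it is a
# simple SU(3) element to multiply with.
u = [[0j, 1 + 0j, 0j], [0j, 0j, 1 + 0j], [1 + 0j, 0j, 0j]]
u_squared = su3_multiply(u, u)

print(bandwidth_bound_gflops(25.0))     # single-precision bound
print(bandwidth_bound_gflops(25.0, 8))  # double-precision bound
```

The estimate also shows why reducing the transferred data helps: any scheme that moves fewer bytes per product raises the bandwidth-bound ceiling proportionally.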
Key words
Cell/B.E., Multi-core, Lattice gauge theory, Lattice QCD