Optimizing Matrix Multiplication on Intel® Xeon Phi™ x200 Architecture

Murat Efe Guney, Kazushige Goto, Timothy B. Costa, Sarah Knepper, Louise Huot, Arthur Mitrano, Shane Story

2017 IEEE 24th Symposium on Computer Arithmetic (ARITH), 2017

Abstract

Matrix multiplication is ubiquitous in scientific computing. From computational science to machine learning, a large and diverse set of applications rely on the performance of general matrix-matrix multiplication (GEMM) subroutines. The Intel® Math Kernel Library provides highly optimized GEMM subroutines that take full advantage of the available parallelism and vectorization in both Intel® Xeon® and Intel® Xeon Phi™ processors. In this paper, we discuss the optimization of GEMM subroutines for the Intel® Xeon Phi™ x200 (code-named Knights Landing).

Keywords

matrix multiplication, Intel Xeon Phi, BLAS, performance optimization
