Optimizing Matrix Multiplication on Intel® Xeon Phi™ x200 Architecture

Murat Efe Guney, Kazushige Goto, Timothy B. Costa, Sarah Knepper, Louise Huot, Arthur Mitrano, Shane Story

2017 IEEE 24th Symposium on Computer Arithmetic (ARITH), 2017

Abstract
Matrix multiplication is ubiquitous in scientific computing. From computational science to machine learning, a large and diverse set of applications rely on the performance of general matrix-matrix multiplication (GEMM) subroutines. The Intel® Math Kernel Library provides highly optimized GEMM subroutines that take full advantage of the available parallelism and vectorization in both Intel® Xeon® and Intel® Xeon Phi™ processors. In this paper we discuss the optimization of GEMM subroutines for the Intel® Xeon Phi™ x200 (code-named Knights Landing).
Keywords
matrix multiplication, Intel Xeon Phi, BLAS, performance optimization