Optimizing LOBPCG: Sparse Matrix Loop and Data Transformations in Action.

Lecture Notes in Computer Science (2017)

Abstract
Sparse matrix computations are widely used in iterative solvers; they are notoriously memory bound and typically yield poor performance on modern architectures. A common optimization strategy for such computations is to rely on specialized representations that exploit the nonzero structure of the sparse matrix in an application-specific way. Recent research has developed loop and data transformations for sparse matrix computations in a polyhedral compilation framework. In this paper, we apply these and additional loop transformations to a real application code, the LOBPCG solver, which performs a Sparse Matrix Multi-Vector (SpMM) computation at each iteration. The paper presents the derivation of the transformations for this application code and the resulting performance. The compiler-generated code attains a speedup of up to 8.26x on 8 threads on an Intel Haswell and reaches 30 GFlops; it outperforms a state-of-the-art manually written Fortran implementation by 3%.
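For readers unfamiliar with the SpMM kernel named in the abstract, the following is a minimal sketch of a sparse matrix times multi-vector product. It assumes the plain compressed sparse row (CSR) layout, which is not the specialized representation the paper derives, and the names `spmm_csr`, `row_ptr`, `col_idx`, and `vals` are illustrative only. It is not the compiler-generated code described above.

```python
def spmm_csr(row_ptr, col_idx, vals, X):
    """Compute Y = A @ X, where A is sparse (CSR) and X is a dense
    block of k vectors stored as a list of rows (n_cols x k).

    Assumed layout: row_ptr[i]..row_ptr[i+1] delimits the nonzeros of
    row i; col_idx[p] and vals[p] give the column and value of nonzero p.
    """
    n_rows = len(row_ptr) - 1
    k = len(X[0]) if X else 0
    Y = [[0.0] * k for _ in range(n_rows)]
    for i in range(n_rows):
        # Walk the nonzeros of row i.
        for p in range(row_ptr[i], row_ptr[i + 1]):
            a = vals[p]
            x_row = X[col_idx[p]]  # indirect access through the column index array
            # Each nonzero is reused across all k vectors; a single
            # sparse matrix-vector product would use it only once.
            for j in range(k):
                Y[i][j] += a * x_row[j]
    return Y


# Example: A = [[2, 0, 1], [0, 3, 0], [4, 0, 5]] and a block of two vectors.
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [2.0, 1.0, 3.0, 4.0, 5.0]
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
Y = spmm_csr(row_ptr, col_idx, vals, X)
print(Y)
```

The indirect read `X[col_idx[p]]` is the kind of memory access that makes such kernels memory bound, which is why the choice of representation matters.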
Keywords
Sparse Matrix Computations, Code Generation Framework, Column Index Array, Optimal Block Preconditioned Conjugate Gradient, Block Krylov Subspace Methods