Javelin: A Scalable Implementation for Sparse Incomplete LU Factorization

2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (2019)

Abstract

In this work, we present Javelin, a new scalable incomplete LU factorization framework for use as a preconditioner when solving sparse linear systems with iterative methods. Javelin improves parallel factorization on shared-memory many-core systems by packaging the coefficient matrix into a format that supports high-performance sparse matrix-vector multiplication and sparse triangular solves with minimal overhead. The framework achieves this by combining traditional permutations, point-to-point thread synchronizations, tasking, and segmented prefix scans within a conventional compressed sparse row format. It also stresses the importance of co-designing dependent tasks, such as sparse factorization and triangular solves, on highly threaded architectures. With these changes, traditional fill-in and drop-tolerance methods can still be used, with observed speedups of up to ~42× on 68 Intel Knights Landing cores and ~12× on 14 Intel Haswell cores.
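For readers unfamiliar with the underlying kernels, the following is a minimal sequential sketch of a zero-fill incomplete LU factorization, ILU(0), stored in compressed sparse row (CSR) form, together with the forward and backward triangular solves that apply it as a preconditioner. It is not Javelin's implementation: it omits the permutations, thread synchronization, tasking, and segmented scans described in the abstract, and the function names `ilu0_csr` and `apply_ilu` are hypothetical. It assumes column indices are sorted within each row and that every row stores a nonzero diagonal entry.

```python
def ilu0_csr(indptr, indices, data):
    """Return ILU(0) factor values sharing the sparsity pattern of the input.

    Entries left of the diagonal hold L (unit diagonal implied);
    entries on and right of the diagonal hold U.
    Assumes sorted column indices and a stored diagonal in every row.
    """
    n = len(indptr) - 1
    lu = list(data)

    # Locate the diagonal entry of each row once.
    diag = [-1] * n
    for i in range(n):
        for p in range(indptr[i], indptr[i + 1]):
            if indices[p] == i:
                diag[i] = p
                break
        if diag[i] < 0:
            raise ValueError(f"row {i} has no stored diagonal entry")

    for i in range(n):
        start, end = indptr[i], indptr[i + 1]
        pos = {indices[p]: p for p in range(start, end)}  # column -> slot in row i
        for p in range(start, end):
            k = indices[p]
            if k >= i:
                break  # reached the diagonal; the rest of the row belongs to U
            lu[p] /= lu[diag[k]]  # multiplier l_ik
            # Update row i with the U part of row k, dropping any fill-in.
            for q in range(diag[k] + 1, indptr[k + 1]):
                t = pos.get(indices[q])
                if t is not None:
                    lu[t] -= lu[p] * lu[q]
    return lu


def apply_ilu(indptr, indices, lu, b):
    """Solve (L U) x = b with a forward then a backward sparse triangular solve."""
    n = len(indptr) - 1
    x = list(b)
    for i in range(n):  # forward solve with unit lower-triangular L
        for p in range(indptr[i], indptr[i + 1]):
            j = indices[p]
            if j < i:
                x[i] -= lu[p] * x[j]
    for i in reversed(range(n)):  # backward solve with upper-triangular U
        d = None
        for p in range(indptr[i], indptr[i + 1]):
            j = indices[p]
            if j > i:
                x[i] -= lu[p] * x[j]
            elif j == i:
                d = lu[p]
        x[i] /= d
    return x


# Example: a 3x3 tridiagonal matrix, for which ILU(0) coincides with exact LU.
A_indptr = [0, 2, 5, 7]
A_indices = [0, 1, 0, 1, 2, 1, 2]
A_data = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
lu_vals = ilu0_csr(A_indptr, A_indices, A_data)
x_sol = apply_ilu(A_indptr, A_indices, lu_vals, [2.0, 4.0, 10.0])
```

Both loops over `i` carry row-to-row dependencies: row `i` can only be processed after every row `k < i` that appears in its lower-triangular pattern. That shared dependency structure between factorization and triangular solves is what the abstract refers to when it argues for co-designing the two on highly threaded hardware.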

Keywords

preconditioner, linear algebra, manycore
