Javelin: A Scalable Implementation for Sparse Incomplete LU Factorization

2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) (2019)

Abstract
In this work, we present Javelin, a new scalable incomplete LU factorization framework for use as a preconditioner when solving sparse linear systems with iterative methods. Javelin improves parallel factorization on shared-memory many-core systems by packaging the coefficient matrix into a format that supports high-performance sparse matrix-vector multiplication and sparse triangular solves with minimal overhead. The framework achieves these goals by combining traditional permutations, point-to-point thread synchronization, tasking, and segmented prefix scans within a conventional compressed sparse row format. Moreover, the framework stresses the importance of co-designing dependent tasks, such as sparse factorization and triangular solves, on highly threaded architectures. With these changes, traditional fill-in and drop tolerance methods can still be used, and we observe speedups of up to ~42× on 68 Intel Knights Landing cores and ~12× on 14 Intel Haswell cores.
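To make the two co-designed kernels named in the abstract concrete, the following is a minimal serial sketch of a zero-fill incomplete LU factorization, ILU(0), followed by the forward and backward triangular solves, all on plain compressed sparse row (CSR) arrays. It is not Javelin's implementation: it has no permutations, threading, point-to-point synchronization, drop tolerance, or higher fill levels. The function names `ilu0_csr` and `ilu_solve` are hypothetical, and the code assumes that column indices within each row are sorted and that every diagonal entry is stored and stays nonzero.

```python
def ilu0_csr(indptr, indices, data):
    """Serial ILU(0) on CSR arrays (illustrative sketch, not Javelin's code).

    Assumes sorted column indices per row and a stored, nonzero diagonal.
    Returns (lu, diag): lu holds L (unit diagonal, strictly lower part) and
    U (diagonal and upper part) in the sparsity pattern of A; diag[i] is the
    position of the diagonal entry of row i.
    """
    n = len(indptr) - 1
    lu = [float(v) for v in data]

    # Locate the diagonal entry of every row.
    diag = [-1] * n
    for i in range(n):
        for p in range(indptr[i], indptr[i + 1]):
            if indices[p] == i:
                diag[i] = p
        if diag[i] < 0:
            raise ValueError(f"row {i} has no stored diagonal entry")

    for i in range(n):
        # Map column -> position for row i, so updates stay inside the pattern.
        pos = {indices[p]: p for p in range(indptr[i], indptr[i + 1])}
        for p in range(indptr[i], indptr[i + 1]):
            k = indices[p]
            if k >= i:          # reached the diagonal: lower part is done
                break
            lu[p] /= lu[diag[k]]                 # multiplier L[i, k]
            # Subtract L[i, k] * U[k, j] for j > k, but only where A[i, j]
            # is stored (zero fill-in).
            for q in range(diag[k] + 1, indptr[k + 1]):
                t = pos.get(indices[q])
                if t is not None:
                    lu[t] -= lu[p] * lu[q]
    return lu, diag


def ilu_solve(indptr, indices, lu, diag, b):
    """Apply the preconditioner: solve L y = b, then U x = y."""
    n = len(b)
    x = [float(v) for v in b]
    # Forward substitution with unit-diagonal L.
    for i in range(n):
        for p in range(indptr[i], diag[i]):
            x[i] -= lu[p] * x[indices[p]]
    # Backward substitution with U.
    for i in reversed(range(n)):
        for p in range(diag[i] + 1, indptr[i + 1]):
            x[i] -= lu[p] * x[indices[p]]
        x[i] /= lu[diag[i]]
    return x


# Small tridiagonal example; ILU(0) is exact here because LU of a
# tridiagonal matrix creates no fill-in.
indptr = [0, 2, 5, 7]
indices = [0, 1, 0, 1, 2, 1, 2]
data = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
lu, diag = ilu0_csr(indptr, indices, data)
x = ilu_solve(indptr, indices, lu, diag, [2.0, 4.0, 10.0])
```

In this sketch, row `i` of the factorization reads the already-factored rows `k < i` that appear in its lower part, and the triangular solves follow the same row-to-row dependencies. That shared dependency structure is one plausible reading of why the abstract argues for co-designing the factorization and the triangular solves.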
Keywords
preconditioner,linear algebra,manycore