Investigating the performance of parallel eigensolvers for large processor counts
Theoretica Chimica Acta (1993)
Abstract
In this paper, we compare the performance of several parallel and serial methods for solving dense real symmetric eigensystems on a distributed-memory, message-passing parallel computer. We focus on matrices of size N = 200 and processor counts P = 1 to P = 512, with execution on the Intel Touchstone DELTA computer. The best eigensolver method is found to depend on the number of available processors. Of the methods tested, a recently developed Blocked Factored Jacobi (BFJ) method is the slowest for small P, but the fastest for large P. Its speed is a complicated non-monotonic function of the number of processors used. A detailed performance analysis of the BFJ method shows that: (1) the factor most responsible for limited speedup is communication startup cost; (2) with current communication costs, the maximum achievable parallel speedup is modest (one order of magnitude) compared to the best serial method; and (3) the fastest solution is often achieved by using fewer than the maximum number of available processors.
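Findings (1) and (3) can be illustrated with a toy cost model. The sketch below is not the paper's performance analysis: the function names, the `serial_work` and `startup_cost` values, and the assumption that startup-dominated communication grows as log2(P) are all hypothetical, chosen only to show how a fixed per-message startup cost can make the fastest run use fewer than the maximum number of processors. It yields a single minimum and does not reproduce the non-monotonic behavior reported for BFJ.

```python
import math


def modeled_time(p, serial_work=1.0, startup_cost=0.01):
    """Toy run-time model (assumed, not taken from the paper).

    Compute time divides perfectly across p processors, while a
    startup-dominated communication term grows as log2(p).
    """
    compute = serial_work / p
    communicate = startup_cost * math.log2(p) if p > 1 else 0.0
    return compute + communicate


def best_processor_count(counts, **model_params):
    """Return the processor count with the smallest modeled run time."""
    return min(counts, key=lambda p: modeled_time(p, **model_params))


# Powers of two covering the range P = 1 to P = 512 used in the paper.
counts = [2 ** k for k in range(10)]
best = best_processor_count(counts)
speedup = modeled_time(1) / modeled_time(best)
print(best, round(speedup, 1))
```

With these assumed parameters the modeled minimum falls at P = 64 rather than P = 512; raising `startup_cost` moves the optimum toward smaller P.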
Keywords
Eigensolving, Massively parallel computers, Small dense matrices