Optimization of Parallel I/O for Cannon's Algorithm Based on Lustre

Distributed Computing and Applications to Business, Engineering & Science (2012)

Matrix multiplication is one of the most important operations in linear algebra and is widely used in many fields of science and engineering. Cannon's algorithm is a classical distributed algorithm for matrix multiplication on two-dimensional meshes. Generally, MPI-IO is used for its I/O requirements. However, it has been well documented that MPI-IO performs poorly in a Lustre file system environment. As the scale of the matrix multiplication increases, this problem tends to become serious and turns into a key factor limiting the performance of the program. To improve the collective I/O performance of Cannon's program, we propose a new aggregation pattern, the stripe-continuous aggregation pattern, which fully considers the striping mechanism and lock protocol of the Lustre file system. Theoretical analysis and experimental results show that, compared with other patterns, this pattern makes full use of the capacity of the Lustre file system and efficiently improves the I/O performance of Cannon's program.
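For readers unfamiliar with Cannon's algorithm, the following is a minimal sketch that simulates it sequentially on a p x p process mesh in plain Python. It is illustrative only and is not the authors' MPI program: the function name `cannon_matmul` and the in-memory "shifts" (which stand in for the message passing between neighbouring processes) are assumptions made for this example.

```python
def cannon_matmul(A, B, p):
    """Sequentially simulate Cannon's algorithm on a p x p process mesh.

    A and B are square n x n matrices (lists of lists) with n divisible by p.
    Hypothetical helper for illustration; a real program would use MPI.
    """
    n = len(A)
    assert n % p == 0, "matrix size must be divisible by the mesh size"
    bs = n // p  # block size owned by each simulated process

    def block(M, i, j):
        # Extract block (i, j) of matrix M.
        return [row[j * bs:(j + 1) * bs] for row in M[i * bs:(i + 1) * bs]]

    def mul_add(C, X, Y):
        # C += X * Y for bs x bs blocks.
        for r in range(bs):
            for c in range(bs):
                C[r][c] += sum(X[r][k] * Y[k][c] for k in range(bs))

    # Initial alignment (skew): process (i, j) starts with
    # A block (i, (i + j) mod p) and B block ((i + j) mod p, j).
    a_blk = [[block(A, i, (i + j) % p) for j in range(p)] for i in range(p)]
    b_blk = [[block(B, (i + j) % p, j) for j in range(p)] for i in range(p)]
    c_blk = [[[[0] * bs for _ in range(bs)] for _ in range(p)] for _ in range(p)]

    for _ in range(p):
        # Every process multiplies its current pair of blocks.
        for i in range(p):
            for j in range(p):
                mul_add(c_blk[i][j], a_blk[i][j], b_blk[i][j])
        # Shift A blocks one step left and B blocks one step up (cyclically).
        a_blk = [[a_blk[i][(j + 1) % p] for j in range(p)] for i in range(p)]
        b_blk = [[b_blk[(i + 1) % p][j] for j in range(p)] for i in range(p)]

    # Gather the result blocks into one n x n matrix.
    C = [[0] * n for _ in range(n)]
    for i in range(p):
        for j in range(p):
            for r in range(bs):
                C[i * bs + r][j * bs:(j + 1) * bs] = c_blk[i][j][r]
    return C


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(cannon_matmul(A, B, 2))
```

In a distributed run each process would first read its own blocks of A and B from the shared file and finally write its block of C, which is where the collective I/O cost discussed in the abstract arises.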
matrix multiplication,parallel processing,collective i/o,lock protocol,new aggregation pattern,cannon's algorithm,parallel i/o,linear algebra,important operation,full use,2d mesh,lustre file system environment,input-output programs,i/o performance,message passing interface,mathematics computing,striping mechanism,file organisation,parallel input-output optimization,lustre file system,mpi-io,message passing,i/o requirement,two-dimensional mesh,stripe-continuous aggregation pattern,protocols,computer architecture,sparse matrices,throughput,servers
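The abstract does not spell out the stripe-continuous aggregation pattern itself, so the sketch below only illustrates the general reason stripe alignment matters on Lustre. It assumes the usual round-robin (RAID-0 style) layout, in which the stripe containing a byte offset maps to a storage target by `(offset // stripe_size) % stripe_count`. All function names and the sizes used are hypothetical. It compares two ways of dividing a file among I/O aggregators: an even byte split, and a split on stripe boundaries. A stripe touched by more than one aggregator is a candidate for lock contention.

```python
from collections import Counter

MIB = 1024 * 1024


def ost_index(offset, stripe_size, stripe_count):
    # Round-robin layout: which storage target holds this byte offset.
    return (offset // stripe_size) % stripe_count


def stripes_touched(start, end, stripe_size):
    # Stripe numbers covered by the byte range [start, end).
    return range(start // stripe_size, (end - 1) // stripe_size + 1)


def even_domains(file_size, n_aggr):
    # Split the file into equal contiguous byte ranges, ignoring stripes.
    step = -(-file_size // n_aggr)  # ceiling division
    return [(k * step, min((k + 1) * step, file_size))
            for k in range(n_aggr) if k * step < file_size]


def stripe_aligned_domains(file_size, n_aggr, stripe_size):
    # Split the file so that every boundary falls on a stripe boundary.
    n_stripes = -(-file_size // stripe_size)
    per_aggr = -(-n_stripes // n_aggr)
    chunk = per_aggr * stripe_size
    return [(k * chunk, min((k + 1) * chunk, file_size))
            for k in range(n_aggr) if k * chunk < file_size]


def shared_stripes(domains, stripe_size):
    # Stripes accessed by more than one aggregator.
    counts = Counter()
    for start, end in domains:
        counts.update(stripes_touched(start, end, stripe_size))
    return sorted(s for s, c in counts.items() if c > 1)


if __name__ == "__main__":
    size, aggregators = 10 * MIB, 4
    print(shared_stripes(even_domains(size, aggregators), MIB))                 # [2, 7]
    print(shared_stripes(stripe_aligned_domains(size, aggregators, MIB), MIB))  # []
```

With a 10 MiB file, 1 MiB stripes, and four aggregators, the even split leaves stripes 2 and 7 shared between neighbouring aggregators, while the stripe-aligned split leaves none shared.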