Hybrid Evaluation for Distributed Iterative Matrix Computation
International Conference on Management of Data(2021)
摘要
ABSTRACTDistributed matrix computation is common in large-scale data processing and machine learning applications. Existing systems that support distributed matrix computation already explore incremental evaluation for iterative-convergent algorithms. However, they are oblivious to the fact that non-zero increments are scattered in different blocks in a distributed environment. Additionally, we observe that incremental evaluation does not always outperform full evaluation. To address these issues, we propose matrix reorganization to optimize the physical layout upon the state-of-art optimized partition schemes, and thereby accelerate the incremental evaluation. More importantly, we propose a hybrid evaluation to efficiently interleave full and incremental evaluation during the iterative process. In particular, it employs a cost model to compare the overhead costs of two types of evaluations and a selective comparison mechanism to reduce the overhead incurred by comparison itself. To demonstrate the efficiency of our techniques, we implement HyMAC, a hybrid matrix computation system based on SystemML. Our experiments show that HyMAC reduces execution time on large datasets by 23% on average in comparison to the state-of-art optimization technique and consequently outperforms SystemML, ScaLAPACK, and SciDB by an order of magnitude.
更多查看译文
关键词
Matrix Computation, Hybrid Evaluation, Iteration
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要