An Optimized Straggler Mitigation Framework for Large-Scale Distributed Computing Systems

IEEE Access (2022)

Abstract
Big Data has become a research focus in industry, banking, social networks, and other fields, and the explosive growth of data and information requires efficient processing solutions. Spark is therefore considered a promising candidate among Large-Scale Distributed Computing Systems for big data processing. One primary challenge is the straggler problem, which arises from heterogeneity: a machine takes an unusually long time to finish executing a task, which decreases system throughput. To mitigate straggler tasks, Spark adopts a speculative execution mechanism in which the scheduler launches additional backup copies of slow tasks so that a faster copy can finish first. In this paper, a new Optimized Straggler Mitigation Framework is proposed. The proposed framework uses a dynamic criterion to identify the closest straggler tasks. This criterion is based on multiple coefficients to achieve a reliable straggler decision. It also integrates historical data analysis and online adaptation for intelligent straggler judgment. This makes speculative tasks more effective and improves cluster performance. Experimental results on various benchmarks and applications show that the proposed framework achieves 23.5% to 30.7% reductions in execution time and 25.4% to 46.3% increases in cluster throughput compared with the Spark engine.
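For orientation, the baseline that the abstract compares against can be sketched in a few lines. This is a simplified illustration of a static, median-based speculation rule of the kind Spark's speculative execution uses, not the dynamic multi-coefficient criterion proposed in the paper. The function name `find_stragglers` and its arguments are hypothetical; the `quantile` and `multiplier` parameters are modeled on Spark's `spark.speculation.quantile` and `spark.speculation.multiplier` settings, and the default values shown are assumptions that differ between Spark versions.

```python
from statistics import median


def find_stragglers(finished_durations, running_elapsed, total_tasks,
                    quantile=0.75, multiplier=1.5, min_threshold=0.1):
    """Return ids of running tasks that a static rule would speculate on.

    finished_durations: durations (seconds) of tasks that already succeeded.
    running_elapsed:    dict mapping task id -> seconds the task has run so far.
    total_tasks:        number of tasks in the stage.
    The default values are illustrative assumptions, not the paper's settings.
    """
    # Wait until enough of the stage has finished to trust the median.
    if total_tasks == 0 or len(finished_durations) < quantile * total_tasks:
        return []
    # A running task is flagged once it exceeds a multiple of the median duration.
    threshold = max(multiplier * median(finished_durations), min_threshold)
    return [tid for tid, elapsed in running_elapsed.items() if elapsed > threshold]


if __name__ == "__main__":
    finished = [10, 11, 12, 10, 11, 12, 10, 11]  # 8 of 10 tasks are done
    running = {8: 14.0, 9: 40.0}                 # two tasks still running
    print(find_stragglers(finished, running, total_tasks=10))
```

Because the threshold here depends on a single fixed multiplier, a rule like this cannot adapt to heterogeneous machines or changing workloads, which is the limitation that a dynamic criterion with historical analysis and online adaptation is meant to address.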
Keywords
Spark, straggler, speculative execution, cluster throughput