Handling Non-Local Executions To Improve Mapreduce Performance Using Ant Colony Optimization

IEEE Access (2021)

Abstract
Improving the performance of the MapReduce scheduler is a primary objective, especially in a heterogeneous virtualized cloud environment. A map task is typically assigned an input split, which consists of one or more data blocks. When a map task is assigned more than one data block, it performs non-local execution. In classical MapReduce scheduling schemes, the data blocks are copied over the network to the node where the map task is running. This increases job latency and consumes considerable network bandwidth within and between racks in the cloud data centre. To address this, we propose a methodology, "improving data locality using ant colony optimization (IDLACO)," that minimizes the number of non-local executions and the virtual network bandwidth consumed when an input split is assigned more than one data block. First, IDLACO determines, for each map task of a MapReduce job, the set of data blocks on which to perform non-local execution so that job latency and virtual network bandwidth consumption are minimized. Then, it selects the target virtual machine to execute the map task based on the machine's heterogeneous performance. Finally, if a set of data blocks is transferred to the same node for repeated job execution, IDLACO temporarily caches the blocks in the target virtual machine. The performance of IDLACO is analysed and compared with the fair scheduler and the Holistic scheduler on four parameters: the number of non-local executions, average map task latency, job latency, and the amount of bandwidth consumed for a MapReduce job. Results show that IDLACO significantly outperformed both the classical fair scheduler and the Holistic scheduler.
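The general idea of using ant colony optimization to place map tasks can be illustrated with a minimal sketch. This is not the IDLACO algorithm from the paper, whose details are not given in the abstract. The cost model (number of non-local blocks plus split size divided by a VM speed factor), the function names `non_local_blocks`, `task_cost`, and `aco_assign`, and all parameter values are assumptions chosen only for illustration.

```python
import random


def non_local_blocks(split, vm_blocks):
    """Count the blocks of an input split that are not stored on the VM."""
    return len(set(split) - set(vm_blocks))


def task_cost(split, vm, vms):
    """Hypothetical cost: blocks to transfer plus compute time scaled by VM speed."""
    blocks, speed = vms[vm]
    return non_local_blocks(split, blocks) + len(split) / speed


def aco_assign(splits, vms, ants=10, iterations=30,
               alpha=1.0, beta=2.0, rho=0.1, seed=0):
    """Basic ant colony search for a task-to-VM assignment (illustrative only)."""
    rng = random.Random(seed)
    vm_ids = list(vms)
    # Pheromone level for every (task index, VM) pair, initially uniform.
    tau = {(t, v): 1.0 for t in range(len(splits)) for v in vm_ids}
    best, best_cost = None, float("inf")
    for _ in range(iterations):
        for _ in range(ants):
            assignment = []
            for t, split in enumerate(splits):
                # Selection weight combines pheromone and inverse cost.
                weights = [
                    tau[(t, v)] ** alpha * (1.0 / task_cost(split, v, vms)) ** beta
                    for v in vm_ids
                ]
                assignment.append(rng.choices(vm_ids, weights=weights, k=1)[0])
            cost = sum(task_cost(s, v, vms) for s, v in zip(splits, assignment))
            if cost < best_cost:
                best, best_cost = assignment, cost
        # Evaporate all pheromone, then reinforce the best assignment so far.
        for key in tau:
            tau[key] *= 1.0 - rho
        for t, v in enumerate(best):
            tau[(t, v)] += 1.0 / best_cost
    return best, best_cost


if __name__ == "__main__":
    # Two splits of two blocks each; each VM stores one split locally.
    splits = [["b1", "b2"], ["b3", "b4"]]
    vms = {"vm1": ({"b1", "b2"}, 1.0), "vm2": ({"b3", "b4"}, 1.0)}
    print(aco_assign(splits, vms))
```

In this toy setup each split is stored entirely on one VM, so the search should settle on the placement that needs no block transfers. A fuller model would also have to account for rack topology, bandwidth, and the caching step described in the abstract, none of which are represented here.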
Keywords
Task analysis, Bandwidth, Servers, Cloud computing, Virtual machining, Switches, Data transfer, Ant colony optimization, heterogeneous performance, MapReduce scheduler, virtualized environment