Deadline-based workload management for MapReduce environments: Pieces of the performance puzzle

Network Operations and Management Symposium (2012)

Abstract
Hadoop, together with the associated MapReduce paradigm, has become the de facto platform for cost-effective analytics over “Big Data”. A growing number of MapReduce applications associated with live business intelligence require completion time guarantees. In this work, we introduce and analyze a set of complementary mechanisms that enhance workload management decisions for processing MapReduce jobs with deadlines. The three mechanisms we consider are the following: 1) a policy for job ordering in the processing queue; 2) a mechanism for allocating a tailored number of map and reduce slots to each job with a completion time requirement; 3) a mechanism for allocating and, if necessary, deallocating spare resources in the system among the active jobs. We analyze the functionality and performance benefits of each mechanism via an extensive set of simulations over diverse workload sets. The proposed mechanisms form integral pieces of the performance puzzle of automated workload management in MapReduce environments.
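To make the three mechanisms concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes an earliest-deadline-first ordering (the abstract does not name the ordering policy), a toy completion-time model in which each stage takes work divided by slots, and a simple rule that hands spare slots to admitted jobs in deadline order. All names (`Job`, `min_slots`, `allocate`) and the per-task time profile are hypothetical.

```python
from dataclasses import dataclass
from math import ceil
from typing import Dict, List, Optional, Tuple


@dataclass
class Job:
    name: str
    deadline: float         # relative deadline, in seconds
    map_tasks: int
    reduce_tasks: int
    avg_map_time: float     # seconds per map task (assumed known from a profile)
    avg_reduce_time: float  # seconds per reduce task (assumed known)


def min_slots(job: Job, max_map: int, max_reduce: int) -> Optional[Tuple[int, int]]:
    """Smallest (map, reduce) slot pair that meets the deadline under the
    toy model: stage time = number of tasks * average task time / slots."""
    reduce_cap = min(max(job.reduce_tasks, 1), max_reduce)
    best = None
    for m in range(1, min(job.map_tasks, max_map) + 1):
        map_time = job.map_tasks * job.avg_map_time / m
        remaining = job.deadline - map_time
        if remaining <= 0:
            continue  # the map stage alone would miss the deadline
        r = max(1, ceil(job.reduce_tasks * job.avg_reduce_time / remaining))
        if r > reduce_cap:
            continue
        if best is None or m + r < sum(best):
            best = (m, r)
    return best


def allocate(jobs: List[Job], total_map: int, total_reduce: int) -> Dict[str, Tuple[int, int]]:
    """Return a slot plan {job name: (map slots, reduce slots)}."""
    by_deadline = sorted(jobs, key=lambda j: j.deadline)  # mechanism 1 (assumed EDF)
    free_m, free_r = total_map, total_reduce
    plan: Dict[str, Tuple[int, int]] = {}
    for job in by_deadline:
        need = min_slots(job, free_m, free_r)             # mechanism 2
        if need is None:
            continue  # this sketch simply skips jobs it cannot fit
        plan[job.name] = need
        free_m -= need[0]
        free_r -= need[1]
    for job in by_deadline:                               # mechanism 3: spare slots
        if job.name not in plan:
            continue
        m, r = plan[job.name]
        extra_m = min(free_m, job.map_tasks - m)
        extra_r = min(free_r, max(job.reduce_tasks, 1) - r)
        plan[job.name] = (m + extra_m, r + extra_r)
        free_m -= extra_m
        free_r -= extra_r
    return plan
```

In this sketch, recomputing the plan whenever a job arrives or finishes implicitly reclaims spare slots: extra slots are only granted after every admitted job has received its minimum allocation. A real scheduler would also have to account for tasks that are already running when slots are taken back, which this toy model ignores.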
Keywords
competitive intelligence, distributed processing, resource allocation, Hadoop, MapReduce, MapReduce environments, automated workload management performance puzzle, big data, business intelligence, complementary mechanisms, cost-effective analytics, de facto platform, deadline-based workload management, job ordering policy, map tailored number allocation mechanism, processing queue, slot reduction, spare resource deallocating mechanism, performance