Performance Variations in Resource Scaling for MapReduce Applications on Private and Public Clouds

IEEE CLOUD (2014)

Citations: 7 | Views: 29
Abstract
In this paper, we delineate the causes of performance variations when scaling provisioned virtual resources for a variety of MapReduce applications. Hadoop MapReduce facilitates the development and execution of large-scale batch applications on big data. However, provisioning suitable resources to achieve the desired performance at an affordable cost requires expertise in the execution model of MapReduce, the resources available for provisioning, and the execution behavior of the application at hand. As an initial step towards automating this process, we characterize the difference in execution response for different MapReduce applications while varying the number of virtualized CPUs, the memory resources, the number of map slots, and the cluster size on a private cloud. This characterization illustrates the performance variation between Reduce-intensive and Map-intensive applications (5x versus 36x speedup, respectively) in how effectively they utilize provisioned resources at different scales (1-64 VMs). By comparing scalability efficiency, we identify the under-provisioning or over-provisioning of resources for different MapReduce applications at large scale.
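To make the speedup figures in the abstract concrete, the following is a minimal sketch of how speedup and scalability efficiency are commonly computed. It rests on assumptions the abstract does not confirm: that the 5x and 36x speedups were measured at 64 VMs relative to a 1-VM baseline, and that "scalability efficiency" means speedup divided by the resource scale factor (the usual parallel-efficiency definition). The function names are hypothetical and are not taken from the paper.

```python
def speedup(baseline_seconds: float, scaled_seconds: float) -> float:
    """Speedup of a scaled run relative to the baseline run."""
    return baseline_seconds / scaled_seconds


def scaling_efficiency(speedup_value: float, scale_factor: float) -> float:
    """Fraction of ideal (linear) speedup achieved at a given scale factor.

    Assumption: efficiency = speedup / scale factor. The paper may define
    scalability efficiency differently.
    """
    return speedup_value / scale_factor


# Assumption: both speedups were observed when scaling from 1 VM to 64 VMs.
SCALE_FACTOR = 64

reduce_intensive_eff = scaling_efficiency(5.0, SCALE_FACTOR)
map_intensive_eff = scaling_efficiency(36.0, SCALE_FACTOR)

print(f"Reduce-intensive efficiency: {reduce_intensive_eff:.1%}")
print(f"Map-intensive efficiency:    {map_intensive_eff:.1%}")
```

Under these assumptions, the Reduce-intensive application reaches roughly 8% of linear speedup and the Map-intensive application roughly 56%, which is one way to read the abstract's point that the same resources can be over-provisioned for one workload and better utilized by another.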
Keywords
provisioned virtual resource scaling, power aware computing, big data, cloud computing, mapreduce applications, dataset size, input scaling, parallel computing, parallel programming, vm, reduce-intensive applications, storage management, virtual machines, resource allocation, large-scale batch applications, virtualized cpu, map slots, hadoop mapreduce, virtualisation, memory resources, performance evaluation, performance variations, cluster size, private cloud, map-intensive applications, scalability efficiency, execution processes