On the Latency-Accuracy Tradeoff in Approximate MapReduce Jobs

IEEE INFOCOM 2017 - IEEE Conference on Computer Communications (2017)

Abstract
To ensure the scalability of big data analytics, approximate MapReduce platforms have emerged that explicitly trade accuracy for latency. A key step in determining optimal approximation levels is capturing the latency of big data jobs, which has long been deemed challenging because of the complex dependencies among data inputs and map/reduce tasks. In this paper, we use matrix analytic methods to derive stochastic models that can predict a wide spectrum of latency metrics, e.g., averages, tails, and distributions, for approximate MapReduce jobs that are subject to input sampling and task dropping. In addition to capturing the dependency among waves of map/reduce tasks, our models incorporate two job scheduling policies, namely exclusive and overlapping, and two task dropping strategies, namely early and straggler, enabling us to realistically evaluate the potential performance gains of approximate computing. Our numerical analysis shows that the proposed models can guide big data platforms in determining the optimal approximation strategies and degrees of approximation.
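The effect of task dropping on latency can be illustrated with a small simulation. This is only a rough sketch and not the paper's method: it swaps the matrix analytic models for plain Monte Carlo sampling, assumes exponentially distributed task times, and treats dropping as discarding a fixed fraction of map tasks before the stage starts, which may differ from the paper's definitions of early and straggler dropping. All names (`kept_tasks`, `simulate_stage`, `latency_summary`) and parameter values are hypothetical.

```python
import heapq
import random


def kept_tasks(num_tasks, drop_ratio):
    """Tasks that still run after a fraction is dropped up front (assumed policy)."""
    return num_tasks - int(round(num_tasks * drop_ratio))


def simulate_stage(num_tasks, slots, drop_ratio, mean_task_time, rng):
    """Completion time of one map stage; each task starts on the earliest free slot."""
    free_at = [0.0] * slots  # min-heap of the times at which each slot becomes free
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(kept_tasks(num_tasks, drop_ratio)):
        start = heapq.heappop(free_at)
        # Assumption: exponential task durations with the given mean.
        end = start + rng.expovariate(1.0 / mean_task_time)
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


def latency_summary(num_tasks, slots, drop_ratio,
                    mean_task_time=1.0, runs=2000, seed=0):
    """Return (mean, empirical 95th percentile) of stage latency over many runs."""
    rng = random.Random(seed)
    samples = sorted(
        simulate_stage(num_tasks, slots, drop_ratio, mean_task_time, rng)
        for _ in range(runs)
    )
    mean = sum(samples) / runs
    p95 = samples[min(runs - 1, int(0.95 * runs))]
    return mean, p95


if __name__ == "__main__":
    # Hypothetical job: 100 map tasks on 10 slots, i.e. roughly 10 waves.
    for ratio in (0.0, 0.2, 0.5):
        mean, p95 = latency_summary(100, 10, ratio)
        print(f"drop={ratio:.1f} mean={mean:.2f} p95={p95:.2f}")
```

Sweeping `drop_ratio` this way gives the latency side of the tradeoff; the accuracy lost by dropping tasks would have to be modeled separately.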
Keywords
data inputs, matrix analytic methods, approximate MapReduce jobs, input sampling, job scheduling policies, task dropping strategies, approximate computing, optimal approximation strategies, latency-accuracy tradeoff, approximate MapReduce platforms, stochastic models, complex data dependency, Big Data analytics scalability, Big Data jobs latency, latency metrics spectrum, early task dropping strategy, straggler task dropping strategy, performance gain evaluation, numerical analysis