Elastic Stream Processing with Latency Guarantees
International Conference on Distributed Computing Systems(2015)
摘要
Many Big Data applications in science and industry have arisen, that require large amounts of streamed or event data to be analyzed with low latency. This paper presents a reactive strategy to enforce latency guarantees in data flows running on scalable Stream Processing Engines (SPEs), while minimizing resource consumption. We introduce a model for estimating the latency of a data flow, when the degrees of parallelism of the tasks within are changed. We describe how to continuously measure the necessary performance metrics for the model, and how it can be used to enforce latency guarantees, by determining appropriate scaling actions at runtime. Therefore, it leverages the elasticity inherent to common cloud technology and cluster resource management systems. We have implemented our strategy as part of the Nephele SPE. To showcase the effectiveness of our approach, we provide an experimental evaluation on a large commodity cluster, using both a synthetic workload as well as an application performing real-time sentiment analysis on real-world social media data.
更多查看译文
关键词
Autoscaling,Big Data,Elastic Scaling,Latency Constraint,Latency Guarantee,Stream Processing,Stream Processing Engine,Streaming
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要