Integrative Dynamic Reconfiguration in a Parallel Stream Processing Engine
2017 IEEE 33rd International Conference on Data Engineering (ICDE), 2017
Abstract
Load balancing, operator instance collocation, and horizontal scaling are critical issues in Parallel Stream Processing Engines: they are needed to achieve low data processing latency, minimized communication cost, and optimized cluster utilization, respectively. In previous work, these issues are typically tackled separately and independently. We argue that these problems are tightly coupled, in the sense that each must determine the allocation of workloads and migrate computational state at runtime, so optimizing them independently yields suboptimal solutions. Therefore, in this paper, we investigate how these three issues can be modeled as one integrated optimization problem. In particular, we first consider jobs in which workload allocation has little effect on communication cost, and model the load balancing problem as a Mixed-Integer Linear Program. Afterwards, we present an extended solution called ALBIC, which supports general jobs. We implement the proposed techniques on top of Apache Storm, an open-source Parallel Stream Processing Engine. Extensive experiments on both synthetic and real datasets show that our techniques clearly outperform existing approaches.
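To make the load balancing formulation more concrete, the following is a minimal sketch of the kind of integer program the abstract alludes to: assign units of work to nodes so that the peak node load is minimized, while the amount of state migrated stays under a budget. It is not the paper's model or ALBIC. The function name `balance`, the notion of "key groups", and all parameters are hypothetical, and exhaustive enumeration stands in for a real MILP solver, so it is only usable on toy inputs.

```python
from itertools import product


def balance(loads, current, num_nodes, state_sizes, migration_budget):
    """Brute-force stand-in for a load balancing integer program.

    All names are illustrative assumptions, not taken from the paper.
    loads[i]        -- processing load of key group i
    current[i]      -- node that currently hosts key group i
    state_sizes[i]  -- cost of migrating key group i to another node
    Returns (assignment, peak_load) minimizing the maximum node load
    among assignments whose total migrated state fits the budget.
    """
    best, best_peak = None, None
    # Enumerate every mapping of key groups to nodes (exponential; toy sizes only).
    for assignment in product(range(num_nodes), repeat=len(loads)):
        # Migration cost: state of every key group that changes node.
        moved = sum(
            size
            for size, old, new in zip(state_sizes, current, assignment)
            if old != new
        )
        if moved > migration_budget:
            continue  # violates the migration constraint
        node_load = [0.0] * num_nodes
        for load, node in zip(loads, assignment):
            node_load[node] += load
        peak = max(node_load)
        if best_peak is None or peak < best_peak:
            best, best_peak = assignment, peak
    return best, best_peak


if __name__ == "__main__":
    # Four key groups, all currently on node 0 of a two-node cluster.
    assignment, peak = balance(
        loads=[5, 3, 2, 2],
        current=[0, 0, 0, 0],
        num_nodes=2,
        state_sizes=[4, 1, 1, 1],
        migration_budget=3,
    )
    print(assignment, peak)
```

In this toy instance the heaviest key group is too expensive to move, so the search can only shift the lighter ones. That illustrates why allocation and state migration have to be decided together rather than separately.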
Keywords
integrative dynamic reconfiguration, load balancing, operator instance collocations, horizontal scaling, data processing latency, optimized cluster utilization, minimized communication cost, optimization problem, workload allocations, mixed-integer linear program, ALBIC, Apache Storm, open-source parallel stream processing engine