Ares: A High Performance and Fault-Tolerant Distributed Stream Processing System

2018 IEEE 26th International Conference on Network Protocols (ICNP), 2018

Abstract
Distributed Stream Processing Systems (DSPSs) have been widely deployed to process infinite data streams. Short processing latency and short recovery time are both vital for many DSPS applications. Existing DSPS designs commonly rely on elaborate task allocation strategies to achieve short processing latency. Such designs, however, ignore the requirement of system fault tolerance. Indeed, providing fault tolerance in a DSPS can significantly degrade system performance. In particular, the intrinsic dependency between upstream and downstream tasks can incur cascaded waiting during recovery, leading to prohibitively long recovery time. In this paper, we propose Ares, a high-performance and fault-tolerant DSPS. Ares considers both system performance and fault tolerance during task allocation. In the design of Ares, we formalize the Fault Tolerant Scheduler (FTS) problem of finding an optimal task allocation that maximizes the system utility. We use a game-theoretic approach to solve the FTS problem and propose a novel algorithm, Nirvana, based on best-response dynamics. We mathematically prove the existence of a Nash equilibrium in the FTS game. We implement Ares atop Apache Storm and conduct comprehensive experiments to evaluate this design. The results show that, compared to existing designs, Ares achieves a 3.6× improvement in throughput and reduces the processing latency and the recovery time by 50.2% and 52.5%, respectively.
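As an illustration of the general idea of best-response dynamics in a task allocation game, the following minimal sketch may help. It is not the Nirvana algorithm and does not use the FTS utility, neither of which is specified in the abstract. It assumes a toy game in which each task's cost is simply the number of tasks sharing its node; the function name `best_response_dynamics` and its parameters are hypothetical. Tasks take turns moving to the node that minimizes their own cost until no task can improve, which is a pure Nash equilibrium of this toy game.

```python
from collections import Counter


def best_response_dynamics(num_tasks, num_nodes, initial=None, max_rounds=1000):
    """Toy best-response dynamics for a load-balancing game (illustrative only).

    Assumption: a task's cost is the number of tasks on its node. This is a
    stand-in for a real utility, not the utility used by Ares.
    """
    # Start with every task on node 0 unless an initial allocation is given.
    assignment = list(initial) if initial is not None else [0] * num_tasks

    for _ in range(max_rounds):
        changed = False
        for task in range(num_tasks):
            load = Counter(assignment)  # tasks per node, 0 for empty nodes
            current = assignment[task]

            def cost(node):
                # Staying keeps the current load; moving adds this task
                # to the target node's load.
                return load[node] if node == current else load[node] + 1

            best = min(range(num_nodes), key=cost)
            # Move only on a strict improvement, so the dynamics terminate.
            if cost(best) < cost(current):
                assignment[task] = best
                changed = True
        if not changed:
            # No task can lower its cost alone: a pure Nash equilibrium.
            return assignment

    raise RuntimeError("no equilibrium reached within max_rounds")


if __name__ == "__main__":
    allocation = best_response_dynamics(num_tasks=6, num_nodes=3)
    print(allocation, Counter(allocation))
```

In this toy game every strict improvement lowers a potential function, so the loop is guaranteed to stop; whether and how a comparable argument applies to the FTS game is established in the paper itself, not by this sketch.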
Keywords
Distributed stream processing system, Task allocation, Fault tolerance, Game theory, Best-response dynamics