Stab-FD: A cooperative and adaptive failure detector for wide area networks

Pierre Sens, Luciana Arantes, Anubis Graciela De Moraes Rossetto, Olivier Marin

Journal of Parallel and Distributed Computing (2024)

Abstract
Failure detectors (FDs) are a fundamental abstraction that plays a central role in the design of distributed systems. FDs are distributed oracles that provide processes with unreliable information about process failures, often in the form of a list of trusted or suspected process identities. In this article, we propose a timer-based FD which assesses the quality of its input links, and exchanges its local estimations with other nodes. Nodes use this information to adjust their timers dynamically. Capturing the variations in the quality of each link reduces the number of false suspicions without degrading failure detection time. We present experiments on a dataset of real traces collected on PlanetLab, and compare our approach to well-known state-of-the-art algorithms. Our results show that our new algorithms yield a good trade-off in terms of failure detection speed and accuracy in real scenarios.
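
The abstract describes the mechanism only at a high level, so the following is a minimal illustrative sketch of the general idea (a heartbeat timer whose length depends on a link-quality estimate that is averaged with estimates reported by other nodes), not the Stab-FD algorithm itself. The class `LinkMonitor`, the jitter-based `instability` metric, the smoothing factors, and the margin factor `k` are all assumptions made for illustration; none of them comes from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class LinkMonitor:
    """Hypothetical per-link monitor; not taken from the paper."""
    alpha: float = 0.125   # assumed smoothing factor for the mean heartbeat gap
    beta: float = 0.25     # assumed smoothing factor for the gap deviation
    mean: float = 0.0      # smoothed inter-arrival time
    dev: float = 0.0       # smoothed absolute deviation of the gap
    last_arrival: Optional[float] = None
    samples: int = 0       # number of gaps observed so far

    def on_heartbeat(self, now: float) -> None:
        """Update the gap statistics when a heartbeat arrives at time `now`."""
        if self.last_arrival is not None:
            gap = now - self.last_arrival
            if self.samples == 0:
                self.mean, self.dev = gap, gap / 2
            else:
                self.dev = (1 - self.beta) * self.dev + self.beta * abs(gap - self.mean)
                self.mean = (1 - self.alpha) * self.mean + self.alpha * gap
            self.samples += 1
        self.last_arrival = now

    def instability(self) -> float:
        """Local link-quality estimate: relative jitter (0 means perfectly regular)."""
        return self.dev / self.mean if self.mean > 0 else 0.0

    def timeout(self, peer_estimates: Sequence[float] = (), k: float = 4.0) -> float:
        """Timer length: mean gap plus a margin scaled by the shared estimate."""
        views = [self.instability(), *peer_estimates]
        shared = sum(views) / len(views)   # assumed merge rule: plain average
        return self.mean * (1 + k * shared)

    def suspects(self, now: float, peer_estimates: Sequence[float] = ()) -> bool:
        """Suspect the sender once the silence exceeds the current timer."""
        if self.last_arrival is None or self.samples == 0:
            return False
        return now - self.last_arrival > self.timeout(peer_estimates)
```

In this sketch, a node that sees regular heartbeats keeps a short timer, while estimates from other nodes that report an unstable link lengthen it, which is one way cooperation could reduce false suspicions. The averaging rule is only a placeholder; the paper's actual estimation and exchange rules are not given in the abstract.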
Keywords
Failure detectors, Quality of service, Fault tolerance, Distributed algorithms, Reliability