Approximate QoS Rule Derivation Based on Root Cause Analysis for Cloud Computing

Satoshi Konno, Xavier Défago

2019 IEEE 24th Pacific Rim International Symposium on Dependable Computing (PRDC) (2019)

Abstract
Ensuring proper quality of service (QoS) is essential for cloud service providers and customers alike. To this end, cloud systems must rely as much as possible on automated and efficient methods of monitoring, introspection, and recovery. In particular, automated recovery is essential to ensure long-term reliability and availability because human intervention is too slow and not every situation can be anticipated. In turn, automated recovery requires both efficient monitoring and accurate identification of root causes to ensure that the same causes will not lead to failures in the future. Current cloud systems use an in-memory time-series database for dynamic analysis or aggregation purposes. When done at all, root cause analysis serves the convenience of reporting and does not need to be very accurate. As a result, recent studies lack details on how to accurately find root causes from time-series monitoring data. This study proposes a novel event-driven monitoring rule inference method based on dynamic case-based reasoning and shape-based root cause analysis. It is designed for autonomous recovery so as to guarantee long-term QoS of cloud systems. The accuracy and performance of the approach are evaluated using realistic monitoring data combining more than a decade of experience as a major cloud service provider (Yahoo). The results show that our approach makes effective use of monitoring data in improving overall QoS and hence opens interesting directions.
Keywords
Cloud computing, Monitoring, Time series database, Federated database, Root cause analysis, Autonomous recovery