Fill-in the gaps: Spatial-temporal models for missing data

2017 13th International Conference on Network and Service Management (CNSM) (2017)

Abstract
Effective workload characterization and prediction are instrumental for efficiently and proactively managing large systems. System management primarily relies on the workload information provided by underlying system tracing mechanisms that record system-related events in log files. However, such tracing mechanisms may temporarily fail for various reasons, leaving "holes" in the data traces. This missing-data phenomenon significantly impedes the effectiveness of data analysis. In this paper, we study real-world data traces collected from over 80K virtual machines (VMs) hosted on 6K physical boxes in the data centers of a service provider. We discover that the usage series of VMs co-located on the same physical box exhibit strong correlation with one another, and that most VM usage series show temporal patterns. By taking advantage of the observed spatial and temporal dependencies, we propose a data-filling method to predict the missing data in the VM usage series. Detailed evaluation using trace data in the wild shows that the proposed method is sufficiently accurate, achieving an average absolute percentage error of 20%. We also illustrate its usefulness via a use case.
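The abstract does not spell out the data-filling method itself, so the following is only a minimal illustrative sketch of the general idea, not the authors' algorithm. It assumes a gap in one VM's usage series can be estimated by averaging two signals: a least-squares fit against a co-located VM's series (spatial dependency) and the mean of the same phase in other periods (temporal dependency). The names `fit_line`, `fill_gaps`, and `mean_absolute_percentage_error`, and the `period` parameter, are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y ~ a * x + b; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    if var == 0:
        return 0.0, my  # constant neighbor: fall back to the target mean
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return a, my - a * mx


def fill_gaps(target, neighbor, period):
    """Fill None entries in `target` (assumed sketch, not the paper's method).

    target   -- usage series of one VM, with None marking missing samples
    neighbor -- usage series of a VM on the same physical box
    period   -- assumed length of the temporal pattern, in samples
    """
    observed = [(n, t) for n, t in zip(neighbor, target)
                if n is not None and t is not None]
    line = None
    if len(observed) >= 2:
        line = fit_line([n for n, _ in observed], [t for _, t in observed])

    filled = list(target)
    for i, value in enumerate(target):
        if value is not None:
            continue
        estimates = []
        # Spatial estimate: map the neighbor's value through the fitted line.
        if line is not None and neighbor[i] is not None:
            a, b = line
            estimates.append(a * neighbor[i] + b)
        # Temporal estimate: mean of observed samples at the same phase.
        same_phase = [target[j] for j in range(i % period, len(target), period)
                      if target[j] is not None]
        if same_phase:
            estimates.append(mean(same_phase))
        filled[i] = mean(estimates) if estimates else None
    return filled


def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, in percent (zeros skipped)."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * mean(ratios)
```

Averaging the two estimates with equal weight is an arbitrary choice made for brevity; a real implementation would likely weight them by how well each explains the observed data.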
Keywords
spatial-temporal models, system management, workload information, system tracing mechanisms, log files, data traces, missing data phenomenon, data analysis, real-world data, data centers, service provider, physical box, VM usage series, temporal dependencies, data-filling method, virtual machines, spatial dependencies, workload characterization