A Semantic-aware Representation Framework for Online Log Analysis

2020 29th International Conference on Computer Communications and Networks (ICCCN)(2020)

引用 28|浏览29
暂无评分
摘要
Logs are one of the most valuable data sources for large-scale service management. Log representation, which converts unstructured texts to structured vectors or matrices, serves as the the first step towards automated log analysis. However, the current log representation methods neither represent domain-specific semantic information of logs, nor handle the out-of-vocabulary (OOV) words of new types of logs at runtime. We propose Log2Vec, a semantic-aware representation framework for log analysis. Log2Vec combines a log-specific word embedding method to accurately extract the semantic information of logs, with an OOV word processor to embed OOV words into vectors at runtime. We present an analysis on the impact of OOV words and evaluate the performance of the OOV word processor. The evaluation experiments on four public production log datasets demonstrate that Log2Vec not only fixes the issue presented by OOV words, but also significantly improves the performance of two popular log-based service management tasks, including log classification and anomaly detection. We have packaged Log2Vec into an open-source toolkit and hope that it can be used for future research.
更多
查看译文
关键词
Log analysis,OOV words,AIOps
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要