Machine Learning-Based Run-Time Anomaly Detection In Software Systems: An Industrial Evaluation

Fabian Huch, Mojdeh Golagha, Ana Petrovska, Alexander Krauss

2018 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE) (2018)

Abstract

Anomalies are an inevitable occurrence when operating enterprise software systems. Traditionally, anomalies are detected by threshold-based alarms on critical metrics or by health probing requests. However, fully automated detection in complex systems is challenging, since it is very difficult to distinguish truly anomalous behavior from normal operation, and the traditional approaches may therefore not be sufficient. We thus propose machine learning classifiers to predict the system's health status. We evaluated our approach in an industrial case study on a large, real-world dataset of 7.5 × 10^6 data points with 231 features. Our results show that recurrent neural networks with long short-term memory (LSTM) are more effective at detecting anomalies and health issues than the other classifiers we compared. We achieved an area under the precision-recall curve of 0.44. At the default threshold, we can automatically detect 70% of the anomalies. Despite the low precision of 31%, the rate at which false positives occur is only 4%.
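The last three figures can look contradictory at first: low precision alongside a low false positive rate. They are compatible when anomalies are rare. The sketch below is a back-of-the-envelope illustration, not a result reported in the abstract: it takes the rounded values (recall 0.70, precision 0.31, false positive rate 0.04) and derives the share of anomalous data points those values would imply. The function names are hypothetical, and the derived prevalence is only as accurate as the rounded inputs.

```python
def implied_prevalence(precision: float, recall: float, fpr: float) -> float:
    """Share of positives (anomalies) implied by precision, recall, and FPR.

    Illustrative helper (assumption, not from the paper). Uses the standard
    definitions:
        precision = TP / (TP + FP)
        recall    = TP / (TP + FN)
        fpr       = FP / (FP + TN)
    """
    # Normalise to one actual positive: TP = recall, so
    # FP = TP * (1 - precision) / precision.
    fp_per_positive = recall * (1.0 - precision) / precision
    # FPR = FP / N, hence N = FP / FPR negatives per actual positive.
    negatives_per_positive = fp_per_positive / fpr
    return 1.0 / (1.0 + negatives_per_positive)


def precision_from_rates(prevalence: float, recall: float, fpr: float) -> float:
    """Precision that follows from a given prevalence, recall, and FPR."""
    true_pos = recall * prevalence
    false_pos = fpr * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Rounded values quoted in the abstract.
    prev = implied_prevalence(precision=0.31, recall=0.70, fpr=0.04)
    print(f"implied anomaly prevalence: {prev:.3f}")
    # Consistency check: feeding the prevalence back recovers the precision.
    print(f"recovered precision: {precision_from_rates(prev, 0.70, 0.04):.2f}")
```

Under these rounded inputs the implied prevalence is roughly 2.5%, which shows why a classifier that wrongly flags only 4% of normal points can still produce more false alarms than true detections.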

Keywords

industrial case study, health issues, run-time anomaly detection, industrial evaluation, enterprise software systems, health probing requests, anomalous behavior, threshold-based alarms critical metrics, machine learning classifiers, system health status prediction, recurrent neural networks, long short-term memory, LSTM, precision-recall curve
