A survey on intelligent management of alerts and incidents in IT services

Qingyang Yu, Nengwen Zhao, Mingjie Li, Zeyan Li, Honglin Wang, Wenchi Zhang, Kaixin Sui, Dan Pei

Journal of Network and Computer Applications (2024)

Abstract
Modern service systems keep evolving as IT advances, which increases system scale and the complexity of dependencies among service components. This scale and complexity make services more prone to failure. To keep services running normally and stably, alert and incident management (AIM), which analyzes and handles service failures in a timely manner, has become an important part of IT service management (ITSM). Many intelligent solutions have been proposed to improve the management process. However, no comprehensive survey currently reviews the related work systematically. Moreover, no integrated AIM architecture covers every detailed process or most of the existing piecemeal solutions. We therefore conduct an in-depth survey to address these problems. To the best of our knowledge, this paper is the most comprehensive survey on intelligent AIM in IT services. It makes the following contributions. First, we summarize an integrated architecture that includes detailed AIM processes and key techniques. Second, we provide a systematic review of related work based on this architecture. Third, we analyze current challenges and trends in AIM.
Keywords
Service system, Alert management, Incident management, ITSM