Efficient Episode Mining with Minimal and Non-overlapping Occurrences

ICDM '10 Proceedings of the 2010 IEEE International Conference on Data Mining(2010)

引用 45|浏览0
暂无评分
摘要
Frequent serial episodes within an event sequence describe the behavior of users or systems about the application. Existing mining algorithms calculate the frequency of an episode based on overlapping or non-minimal occurrences, which is prone to over-counting the support of long episodes or poorly characterizing the followed-by-closely relationship over event types. In addition, due to utilizing the Apriori-style level wise approach, these algorithms are computationally expensive. In this paper, we propose an efficient algorithm MANEPI (Minimal And Non-overlapping EPIsode) for mining more interesting frequent episodes within the given event sequence. The proposed frequency measure takes both minimal and non-overlapping occurrences of an episode into consideration and ensures better mining quality. The introduced depth first search strategy with the Apriori Property for performing episode growth greatly improves the efficiency of mining long episodes because of scanning the given sequence only once and not generating candidate episodes. Moreover, an optimization technique is presented to narrow down search space and speed up the mining process. Experimental evaluation on both synthetic and real-world datasets demonstrates that our algorithms are more efficient and effective.
更多
查看译文
关键词
depth first search,search space,data structures,data mining,algorithm design and analysis,prefix tree,databases,optimization
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要