Two Decades of Pattern Mining: Principles and Methods.

Lecture Notes in Business Information Processing (2017)


Abstract

In 1993, Rakesh Agrawal, Tomasz Imielinski and Arun N. Swami published one of the founding papers of pattern mining: "Mining Association Rules Between Sets of Items in Large Databases". It aimed at enumerating the complete collection of regularities observed in a given dataset, such as sets of products purchased together in a supermarket. For two decades, pattern mining has been one of the most active fields in Knowledge Discovery in Databases. This paper presents an overview of pattern mining. We first present the principles of language and interestingness, the two key dimensions for defining a pattern mining process that suits a specific task and a specific dataset. The language defines which patterns can be enumerated (itemsets, sequences, graphs). The interestingness measure defines the archetype of patterns to mine (regularities, contrasts or anomalies). Starting from a language and an interestingness measure, we describe the two main categories of pattern mining methods: enumerating all the patterns whose interestingness exceeds a user-specified threshold (satisfaction problem), or enumerating all the patterns whose interestingness is maximal (optimization problem). Finally, we present an overview of interactive pattern mining, which aims at discovering the user's interest while mining relevant patterns.
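To make the two categories of methods concrete, here is a minimal sketch that assumes itemsets as the language and relative support as the interestingness measure. It is not taken from the paper: the toy transactions, the function names, the use of `>=` for the threshold test, and the top-k reading of the optimization problem are all illustrative assumptions. The brute-force enumeration is only practical for tiny datasets.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]


def support(itemset, db):
    """Interestingness (assumed): fraction of transactions containing the itemset."""
    return sum(1 for t in db if itemset <= t) / len(db)


def all_itemsets(db):
    """Language (assumed): every non-empty itemset over the items seen in db."""
    items = sorted(set().union(*db))
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def mine_threshold(db, min_support):
    """Satisfaction problem: keep every pattern whose support reaches min_support."""
    result = {}
    for pattern in all_itemsets(db):
        s = support(pattern, db)
        if s >= min_support:  # assumption: threshold is inclusive
            result[pattern] = s
    return result


def mine_top_k(db, k):
    """Optimization problem, read here as top-k: keep the k best-scoring patterns."""
    scored = [(support(p, db), sorted(p)) for p in all_itemsets(db)]
    # Highest support first; ties broken alphabetically for a deterministic order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


if __name__ == "__main__":
    for pattern, s in mine_threshold(transactions, 0.5).items():
        print(sorted(pattern), s)
    print(mine_top_k(transactions, 2))
```

Swapping `support` for a contrast or anomaly score, or `all_itemsets` for a generator of sequences or graphs, changes the task without changing either mining loop, which is the separation between language and interestingness that the abstract describes.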


Keywords

Data mining, Pattern mining
